I've omitted the names of people except for presenters on the grounds of privacy. If anyone recognises themselves and would like the reference removed, please email me and I'll be happy to make the amendment.
Most of the notes on presentations were taken at the time and haven't been significantly revisited, hence some dodgy grammar or nonsensical sentences. On some topics my basic understanding was minimal and so the nonsense quotient will be higher. If you want accuracy, go read the proceedings.
Any opinions expressed are not those of my employers or the Open University. Heck, they may not even be mine.
In April of 2000 Jon Hall and I had a paper accepted for the
10th International Conference on Field Programmable Logic and
Applications, known as FPL 2000 to its aficionados. The acceptance
was for a poster presentation and a short paper to publish in the
proceedings. Since most of the screwy ideas in the paper were
mine, Jon providing most of the sensible stuff, I got to go to the
conference in Villach, Austria and defend it. This report is
about the conference, Villach, me, and how we all got on. Hopefully
it may prove useful to anyone going to a conference for the first
time, and unsure what to expect. And for any mad hikers who hang
around Villach for any length of time, come to that.
Saturday 26th August
The problem with Austria is that it's a darn inconvenient place to get to. The best flight I could get had a 5am check-in for Saturday morning. Fortunately I stayed over with my little bro' in Datchet, 15 minutes from Heathrow Terminal 2, but it still meant getting up at 4:15am. Somehow I made it to Heathrow long term parking without driving into anything, got the bus and arrived at the terminal more or less on time. It worked out quite nicely, going more or less straight to the gate and then not having much of a wait before boarding.
Austrian Airlines provided a 737-400 or something similar, which was a bit smaller than the 747s I'd been used to on the trans-Atlantic routes, but reasonably comfortable - leg room is king from my perspective, and the check-in clerk had helpfully fixed me aisle seats for the outward flights. The flight to Vienna was pretty quick, at less than two hours even with a take-off delay. Apart from the breakfast (not quite up to British Airways or Virgin standards, but palatable) I spent most of the time poring over the in-flight magazine. The content was no less banal than any airline, but it was mostly in German with about 60% translated to English too. This was very handy as a crash course for getting my German back up to speed. I used to be pretty good at it, but time has rusted the old vocal cords and numbed the relevant neurons.
A brief stop-over in Vienna, then we were at the mercy of Tirolean Airlines for the hop to Klagenfurt. They provided a Dash 8-300 puddle-jumper, which I've always liked because it feels much more like actual flying than riding the front of a jet reaction. This flight was less than an hour, and the latter half of it showed us the peaks and lakes of the Alps which permeate the Carinthia region of Austria.
FPL'2000 had a welcome party ambush us in reception - about a quarter of the flight were conference attendees or their partners - and they shepherded us onto a bus for the 30 minute drive to Villach. We passed through beautiful Alpine scenery under a blue sky and blazing sun. It wasn't too hot - only the mid-20s Celsius - but just right.
The coach dropped us off at our respective hotels. I went to the Hotel Kasino along with a postgrad from Imperial. The hotel wasn't exciting, but comfortable. The guy on the main desk could have done with a humour transplant, though I did recall some piece of news from a couple of days before that anyone who speaks German is predisposed towards humourlessness because the facial expressions needed to pronounce umlauts cause frowning and hence a depressive attitude. And if you believe that, I've got this really nice bridge in London which I'd love to sell you.
I took a stroll around Villach, which isn't that large a city - 56,000 people, according to the FPL rep. The town (shopping) centre isn't much larger than that in my home town of Andover, though it is much nicer. The sunshine probably helps. I toured the town, got provisions for a hike tomorrow, managed to get my hair cut - now there's an acid test for confidence in a foreign language, if you're at all concerned about your appearance - and snacked at a pavement café on the main street, then headed up north out of town to tackle a nearby mountain.
The Oswaldiberg is 963m high, which is a fair peak by British standards, but Villach is far from sea level (about 480m). It took about 20 minutes to reach the mountain base from the Hotel Kasino, then another hour of climbing before reaching the top. A track and a metalled road intertwine all the way up, so you can choose your path. I opted for metalled road most of the way since I was only wearing trainers and was just warming up for Sunday. But the view across Villach to the southern mountain range was stunning. I knew I had to get there tomorrow, though it looked like it went up to 1400m+.
At the top of Oswaldiberg was a small chapel, perched on the peak of the berg with an observation platform around its steeple, but the entrance was unfortunately locked. If anyone's going up the berg, find out when the tower is open beforehand.
Coming down was a lot easier than going up. I did notice a sign warning that the woods covering the mountain were a rabies area, which was probably better to know later than earlier. Besides, I've had a rabies shot. Five years ago, but hey who's counting? (Courtesy of the National Blood Transfusion Service, if you're interested. Long story. Involves Romania and an over-eager NBTS doctor.)
After showering and changing I went wandering around town looking for food. Disappointingly, the plate of wurst and sauerkraut for which I'd been hungering was not available anywhere I looked. Instead I settled for pizza and beer at the Cafe Mosser, adjunct to the Hotel Mosser. Now there's a name that non-Germans had better be damn careful in pronouncing. (Hint: replace the 'o' with 'ö'.) But the food was good, and cheap; 200sch (less than £9) for two beers, large pizza and two good coffees. Plus, the waitress humoured my attempts at German. Recommended.
The most annoying thing about Villach is that practically everything
closes down come 12:30 on Saturday. This bodes poorly for any shopping
tomorrow, but at least McDonald's will be open. On the plus side, most
things are reasonably priced. Having got 23ish schillings to the pound,
prices aren't prohibitive, though there doesn't seem to be the universal
acceptance of Visa/Mastercard here that exists in America and (to a lesser
extent) in Britain. I was starting to regret not having bought more
schillings than I did.
Sunday 27th August
The night passed relatively peacefully, sleeping like the proverbial piece of wood. Just the odd interruption from the intoxicated local youth, but hey a town needs its colour right? Breakfast was the traditional fruit, bread, cheese and meat. I remembered seeing in an article about Strasbourg that a German runs a café called "The King and Fool"; for breakfast you can either have the German style (The King's Breakfast) or the French (The Fool's Breakfast).
Duly carb-loaded, I packed up my rucksack and headed south out of town. It was a half hour trek on pavements to the bridge leading into Tschinowitz, then I changed to smaller roads and tracks that led me towards Finkelstein. The tracks took me over a low wooded hill which hides Finkelstein from Villach. As I descended the hill I thought "I'm not going to like coming up here at the end of the walk."
Finkelstein itself is an absolutely charming little village which backs onto the massive 1800m mountains south of Villach. Since Finkelstein is only at 550m itself, and the mountainsides are very steep, the sight is truly impressive. It's also very Alpine-rural, surrounded by fields and featuring the traditional cows with tinkling bells. I trekked through the sunny streets as the church bells tolled for the morning service. It was well over an hour's trek from Villach centre, and the early morning greyness had given way to another beautifully blue sky.
As I approached the mountains, large billboards appeared showing me the various routes up them. I followed the road which led around a massive outcrop of rock, "Kanzianiberg" or something like that. This presented huge rock faces ideal for climbing, though I noticed on a board giving international route ratings that most routes were between "very hard" and "Spiderman only need apply." The climbers sauntering by had the obligatory tinkling cluster of karabiners and friends hanging from their belts.
Finally the required track opened up, and I turned off the road and started up the mountain. The going was fairly easy at first, curving up through the woods, but soon I was out into the open on some big forestry road-making operation. The track took me off it pretty quickly, then up a steep slope and down a muddy track.
Just as I was thinking that I was now a fair way up the mountain and into the wilderness, the trees parted to reveal an Alpine meadow and on the other side of it a large house. So much for wilderness. The path took me past the house, which turned out to be part of a small holiday home development (very Alpine-lodge, though mostly brick rather than wood) and then to the base of a toboggan run. This ran by a ski-lift pulling the tobogganer and his wagon up to the top of a long field (facing backwards, presumably for psychological intimidation) then they were placed in a metal chute and let go. The more adventurous attained fairly impressive speeds near the end; this was not surprising given how high up the field went, and the trail followed it. By the time I reached the top I was well short of breath and reflecting that I should just have bought a toboggan ticket.
The trail did another one of its steep ascents along a precipitous path through the woods. Finally I had the impression that I was pretty high up; the outcropping on the other side of the steep valley was jagged and imposingly tall. Then the woods parted to reveal a shrub-covered valley that led up. Way, way up.
Fortunately the path zigzagged all the way, but the climb was hard going. The sun beat down and I was glad for a hat and sunscreen. On the plus side, the famed Alpine flowers were to be found sprouting on the sides of the path. Delicate flowers, delicate but beautiful colours. I was even lucky enough to find some wild raspberry plants. In the interests of conservation I restricted myself to one fruit, and it wasn't very big, but oh the taste! Truly a heavenly berry.
I rested for a moment at a small locked wooden hut which was apparently a customs post. Slovenia wasn't far away. Frankly, I don't know why they bothered. Anyone who hauls smuggled goods over these mountains truly deserves to avoid tax.
Finally a breakthrough; the valley came to an end, and a viewpoint enabled me to look down the mountainside to Finkelstein, and across to Villach. Very impressive, and they were a long way down. I didn't know how far up I was, but it felt good.
The track was now flatter, though still ascending steadily. The surrounding flora was very Alpine meadow, and in the sun could have been a British field on a warm summer day. I had to remind myself that I was likely higher than the summit of Ben Nevis.
Finally the path broke off from its relentless up, and hopped over a short spur to reveal a wooden house nestling in the crook of the mountains. This was the Mitzl-Moitzl-Hütte, a refuge for those mad souls making multi-day hikes in the hills. It was locked, presumably so that only paying hikers could stop in, but afforded a wonderful view across Villach, over to the lakes in the east and the mountains in the north. There was also a rubber stamp affixed beside the door, which I used to mark the back page of my pocket German dictionary. It announced the altitude as 1525m; I had ascended just under a kilometre in 2.5 hours. No wonder I was knackered.
The trail continued up to the nearby peaks, and I was tempted to follow. They were only 200-300m higher, and I could clearly see details on them such as a large cross mounted on the peak to the west. But I checked my food, water and leg situation and opted for a descent. Maybe next time.
The descent itself was fairly uneventful, taking just under 2 hours to reach the edge of Finkelstein. By now my feet were really starting to ache, not the pain of blistering but just the "we've come a bloody long way and we're tired of it" ache. Ascending the gentle slope which I'd descended into Finkelstein was, as predicted, not enjoyable.
How I made the last kilometre into Villach I'm not quite sure - I remember clinging to a lamppost for support at one pedestrian crossing, but I got back to the hotel at just after 4pm, a 7-hour hike. Enjoyable, but my goodness it was hard work.
After a shower I was so knackered that I lay down and watched CNBC for a bit (yes, I really was that drained), but stirred my stumps to go looking for calorie replacement. A Bierhof across the road from where I ate last night provided the goods; dark Villach beer (excellent) and a skillet of meat, noodles, vegetables and strange fried things.
The conference centre was the next stop, where people were registering and then talking shop over drinks and snackage. I got my bagful of bumf, including the conference proceedings (I swear, it gets 50% heavier every year) and then joined a group of Imperial postgrads to catch up on things. Usefully, I learned that Steve McKeever had produced a semantics for Pebble, probably based on CSP. I resolved to drop him an email as we had Things To Discuss.
The bash went on until 9pm, then the conference centre staff started
politely herding us out. I went back home to rest my weary legs, skim
through the proceedings and crash out.
Monday 28th August
Waking up this morning was easy, getting up was less so. My legs felt as if they had done their work for the week, and wanted to rest. Cursing and dragging them until they begrudgingly started to work again, I sorted my gear for the conference.
The blue skies of the weekend had given way to a wet drizzle as I made my way to the conference centre. Clearly this had been arranged by the FPL organisers to stop people skiving off into the mountains. But the conference centre was well lit and comfortable; I made a preliminary scout around the industrial exhibitors to see what sort of things they were hawking. Pretty much everyone who was anyone in the FPGA field was there, though I didn't spot any Embedded Solutions bods on the first pass.
The student poster boards were just off the exhibition area; tall, narrow poster boards with our names and an abbreviated poster title indicating which board belonged to whom.
Come 9am we gathered in the main hall for the welcome. A photographer was huddled in one corner getting shots of the audience; 230-odd researchers and engineers this early in the morning can't have been a pretty sight. I hope they had a good airbrushing package if they were planning to sell the photos.
Profs Hartenstein and Grünbacher lurked up front with the early morning program on display behind them. Grünbacher kicked off the welcome with a brief upbeat message. Hartenstein did the thanks for the sterling work that the organisers had done, then presented some basic stats. He noted that 72% of attendees were from Europe, with Asia/South Pacific and America splitting the rest pretty much fifty-fifty. He gave a summary of FPL's history since it started at Oxford in 1991; originally the idea was to alternate between Oxford and the continent, though last year was in Glasgow and next year will be in Belfast.
Dr. Tsugio Makimoto, the corporate chief technologist of Hitachi, has written two books about the way the semiconductor industry has developed and how it's affected society and business. In this speech he looked at how the industry has gone in cycles between standardisation and customisation, and how the new field-programmable technology might develop.
Field-programmability was classified under 'standardisation' in "Makimoto's Wave", since it is standardised in manufacturing though customised in application.
Makimoto-san gave a summary of the cycles to date, approximately every 10 years from 1957 onwards. In the current (1997-2007) cycle, FPGAs have already exceeded gate arrays in production totals.
He got us to guess how long a pendulum with a ten-year period was; apparently it's 9.8 light years!
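As a rough sanity check on that figure, the ideal simple-pendulum formula T = 2π√(L/g) can be inverted to give L = gT²/(4π²). The sketch below assumes a uniform g of 9.81 m/s² (plainly unrealistic at this scale) and a 365.25-day year; the constant and function names are my own. Under those assumptions the length comes out at roughly 2.6 light years rather than the number noted above, so treat the quoted figure with the usual caution about these notes.

```python
import math

G = 9.81                                   # m/s^2, assumed uniform along the pendulum
SECONDS_PER_YEAR = 365.25 * 24 * 3600      # Julian year
METRES_PER_LIGHT_YEAR = 299_792_458 * SECONDS_PER_YEAR

def pendulum_length_m(period_s, g=G):
    """Length of an ideal simple pendulum: T = 2*pi*sqrt(L/g), so L = g*T^2 / (4*pi^2)."""
    return g * period_s ** 2 / (4 * math.pi ** 2)

# Ten-year period, converted to light years.
length_ly = pendulum_length_m(10 * SECONDS_PER_YEAR) / METRES_PER_LIGHT_YEAR
print(f"{length_ly:.2f} light years")
```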
He speculated that "nomadic" tools such as mobile phones and PDAs could cover everything that 1980s home devices could do, showing the impressive growth rate of the digital consumer market. He pointed out that product lifecycles were shortening, and so a fast ramp-up was required for success, with only about 1 year of product life.
He described the Hitachi F-ZTAT micro (Zero Turn Around Time) which had Flash memory for its programming, enabling it to be incorporated into a wide variety of devices and custom-programmed. In a car, for instance, an F-ZTAT could be reprogrammed even at a service station.
He quoted the threshold of 50MHz, 50k gates and 50k production items below which PLDs are regarded as more cost-effective than FPGAs, though anticipated a move to 100, 100, 100 in the future.
He reviewed Altera and Xilinx's strategies. Altera has gone for complete system-on-chip, with either a soft reprogrammable core (NIOS) or hard core (ARM, MIPS). Xilinx proposes Network Reconfigurable Logic, where devices such as Virtex co-operate with software and network devices to reconfigure over the Net. This is proposed to save maintenance costs.
He looked at the new technologies such as FRAM, MRAM compared to Flash and DRAM; they look promising but aren't quite there yet.
Application-Specific Programmable Product is a new architecture for system-on-chip with embedded PLD blocks for user-customisation, with most of the part taken up by IP cores.
After FPL, what then? Makimoto-san believed that automated System on Chip/Package would be next, in the customisation side of the pendulum, with e-business providing the infrastructure for development.
For the future, he looked at technology like Ruby Talk (automatic translation) and Dick Tracy's watch (e.g. MP3 player and digital camera) which are the "ultimate" in nomadic tools and already appearing in the market.
He closed off by envisaging an image of a nomadic communication scene; anytime, anywhere, with anyone, by any media.
Sriram Govindarajan from Cadence and U Cincinnati gave this talk, whose title was about 50% too long. Kicking off the technical talks is quite a burden, but he held up well. Usefully, he presented an overview slide so we knew where we were going. He was reasonably clear and spoke to the audience rather than to his notes or the screen.
The nub of the talk was the task of partitioning a task across multiple FPGAs (spatial) or on the same FPGA at different times (temporal), generating configuration data for reconfigurable components and a controlling program for a controller.
SPARCS is a synthesis tool developed at U Cincinnati. The design is fed in as a behavioural spec, temporal partitioned, spatial partitioned, then software is generated. The talk focuses on the synthesis framework which does this partitioning. There is high-level evaluation to partition, then device-specific synthesis to produce programming data for the target devices.
He reviewed related work on partitioning and exploration, basically saying that it was very hard and prone to design space explosion. What his team did was to use a heuristic multi-step exploration with knowledge of partitioning. So it wasn't precise, but could be effective if it got things right. They keep a schedule for each task in the program, and use these schedules to work out an overall schedule.
Some of the slides were a bit full, and it was hard to track what was going on. Some gesturing with a pointer may have made things clearer.
As an example, the partitioner finds something with a long latency, which might violate interconnect constraints, and trades off area for latency (these are viewed as orthogonal) to try to stop the violation. But tight links between blocks can screw this up, so exploration techniques must know how tasks are linked so that they know to change linked tasks together.
He outlined the algorithm and described the design cost functions used (partition area violation, task criticality and latency-area tradeoff).
The temporal partitioner in SPARCS uses integer linear programming, aiming to minimise the sum of the temporal schedule and latency constraints.
The spatial partitioning can be either a genetic algorithm or simulated annealing. According to a test, GA performed slightly better but not by much, for both traditional and the proposed exploration models.
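To give a flavour of the simulated annealing option, here is a minimal sketch that splits a task graph across two FPGAs. It is not SPARCS's implementation: the cost here is simply the number of edges cut between devices (my assumption; the talk's cost functions covered area violation, criticality and latency-area tradeoff), and the function names, cooling schedule and example graph are all made up for illustration.

```python
import math
import random

def cut_size(edges, side):
    """Number of task-graph edges whose endpoints sit on different FPGAs."""
    return sum(1 for a, b in edges if side[a] != side[b])

def anneal_partition(n_tasks, edges, seed=0, steps=5000, t0=2.0, cooling=0.999):
    rng = random.Random(seed)
    # Start from a balanced assignment: even tasks on FPGA 0, odd tasks on FPGA 1.
    side = [i % 2 for i in range(n_tasks)]
    cost = cut_size(edges, side)
    best, best_cost = side[:], cost
    temp = t0
    for _ in range(steps):
        # Swap one task from each device so the partition stays balanced.
        a = rng.choice([i for i in range(n_tasks) if side[i] == 0])
        b = rng.choice([i for i in range(n_tasks) if side[i] == 1])
        side[a], side[b] = 1, 0
        new_cost = cut_size(edges, side)
        delta = new_cost - cost
        # Always accept improvements; accept worse moves with a temperature-dependent probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            cost = new_cost
            if cost < best_cost:
                best, best_cost = side[:], cost
        else:
            side[a], side[b] = 0, 1    # rejected: undo the swap
        temp *= cooling
    return best, best_cost

# Two tightly connected clusters of four tasks, joined by a single edge.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),
         (4, 5), (4, 6), (4, 7), (5, 6), (5, 7), (6, 7),
         (3, 4)]
best, best_cost = anneal_partition(8, edges)
print(best, best_cost)
```

The swap move keeps the two devices equally loaded, which stands in for an area constraint; a genetic algorithm would explore the same assignment space with a population instead of a single candidate.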
He summarised the work done, and took questions; only one in fact, but handled it well.
At coffee break I wandered around the industrial stands and spoke to a chap at Actel. They do, among other things, devices suitable for milspec / space environments. I asked whether they had their synthesis software verified to similar reliability levels, but it turns out that they principally use third party tools such as those from Synopsys and Synplicity, so the question didn't really arise.
I also spoke to Altera about their use of ARM cores in the new Excalibur chip. They're using the ARM 9, though the engineer was unclear whether you could bolt on an FPU too.
I picked the Prototyping track for the pre-lunch session, as the alternative was Network Processors which don't really intersect my areas of interest.
Another title which could do with shortening. How about "Faster Faking of FP Comms for Embsys Design?" Dr. Jürgen Becker from Darmstadt presented this talk.
The motivation was that comms often bottlenecks complex systems. There are a wide range of ways of organising comms. What's best?
DICE (Darmstadt's own tool) models comms via synchronous or asynchronous send/receive ops. Processes are spec'd in VHDL and C. To synthesise the comms, it generates a description of the comms topology (what talks to what) and replaces the abstract calls with accesses to shared resources.
REPLICA is a reconfigurable comms structure, the target of the comms synthesis. The synthesis can draw on a library of resources, which have performance data associated with them. The result is monitored in hardware, possibly in real time, and optimised accordingly. If it's really screwed up then you might have to repartition the whole thing and retry.
[General complaint: why produce complex equations on slides when you don't even start to explain the terms? They're meaningless to most of the audience. Save it for the written paper!]
The synthesis groups transfers where it can, generates an init file for comms synthesis, edits the constraints such as real-time constraints given by user or real-time environment, then runs synthesis algorithms mapping processes to processing elements (PEs). If enough I/O ports are available then it generates the comms structure directly, otherwise it generates a bus-based comms system.
The synthesis times are order of magnitude of a second for some smallish test structures on a 300MHz UltraSparc II.
He presented some simple examples and a pseudocode transfer merging algorithm for mapping time-independent transfers onto the same bit of hardware. There was an example of how this transfer merging worked on a smallish example. There is also some bus optimisation which affects how access to the bus is prioritised.
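I didn't copy down the pseudocode, but the general idea of merging transfers that never happen at the same time can be sketched as a greedy interval assignment. This is my own reconstruction, not the algorithm from the talk: it assumes each transfer has a known time window, treats "time-independent" as "windows don't overlap", and the names and example transfers are hypothetical.

```python
def merge_transfers(transfers):
    """Greedily merge transfers onto shared channels.

    transfers: list of (name, start, end) tuples, with end exclusive.
    Returns a list of channels, each a list of transfer names whose
    time windows never overlap.
    """
    channels = []    # each entry: [time the channel becomes free, [names]]
    for name, start, end in sorted(transfers, key=lambda t: t[1]):
        for channel in channels:
            if channel[0] <= start:      # channel is idle when this transfer starts
                channel[0] = end
                channel[1].append(name)
                break
        else:
            channels.append([end, [name]])   # no idle channel: allocate new hardware
    return [names for _, names in channels]

example = [("a->b", 0, 4), ("c->d", 2, 6), ("b->c", 4, 8), ("d->a", 6, 9)]
channels = merge_transfers(example)
print(channels)   # [['a->b', 'b->c'], ['c->d', 'd->a']]
```

Four transfers end up sharing two channels, which is the kind of saving the merging step is after.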
The goal is a baseband single chip transceiver running on a Virtex device. See talk tomorrow!
Helena Krupnova presented her talk as an overview of the current solutions for FPGA-based emulation and rapid prototyping. The talk came out of the INP Grenoble training program on SoC design using design and prototyping platforms.
Currently 60-80% of the design cycle is spent in the verification step. Lots of different approaches: formal verification, simulation and accelerated simulation, emulation, and rapid prototyping. Emulation and rapid prototyping are 5-6 orders of magnitude faster than simulation, allowing verification in real time and parallel verification of HW and SW with early integration.
Three main types of prototyping platforms: high capacity commercial emulators, semi-custom platforms, and the full-custom platforms (lots of them!).
To choose which one you'll use there's a big checklist that Krupnova listed, principally divided into Architecture, Debugging facilities, Compilation chain and Cost.
The big commercial emulators have hundreds of FPGAs, complex partitioning problem accordingly, with a typical filling ratio of about 30% and frequency of 1-3MHz. There's good visibility of what goes on inside and automatic software flow, but costs about $1 per emulation gate (ouch!). Krupnova described the QuickTurn, IKOS and Mentor Graphics products in detail, which are commercial, semicustom and full custom respectively.
Krupnova presented a tradeoff analysis graph, which was unfortunately very hard to interpret meaningfully.
Rapid prototyping systems have 10-20 FPGAs with 60-80% utilisation, the latest FPGAs and 20-30MHz performance, but allow more limited probing than the full commercial emulators, and the design flow is more semi-automatic.
Examples of this are the Aptix MP3 / MP4, which Krupnova described and which allows switching of FPGA models as required, and Simutech's RAVE platform.
Custom boards are classified in many different ways e.g. single/multiple devices, architecture, origin, application domain, routing architecture and distribution. University systems are usually for prototyping and research. Vendor boards allow testing of the latest devices, prototyping small hardware and educational uses. Some prototyping boards include microprocessors.
Andreas Pyttel from Infineon aimed to introduce us to Infineon's approach to prototyping.
Requirements go into a functional design, which is specified and simulated, then into architectural design which is partitioned and architectured, then to integration where synthesis occurs and co-simulation / debugging happens. Validation can cause backward steps at any time.
Prototyping only happens at the functional level where only system behaviour is important, aiming to present the customer with something to play with. The prototype model is an algorithm in an FPGA, and a C program on a PC. The program configures the FPGA and writes/reads data via shared memory over PCI. They use an RC1000-PP card from Embedded Solutions, placed in a PC linked up to a logic analyser. The board has a Virtex 1000 FPGA, and uses four banks of shared memory with semaphores to control mutex on these.
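The semaphore handshake over a shared bank can be modelled in software. The sketch below is purely illustrative and does not use the RC1000-PP API: a Python thread stands in for the FPGA, a list stands in for one memory bank, and the class and function names are invented. It assumes the host always owns the bank first and that the two sides strictly alternate.

```python
import threading

class SharedBank:
    """One memory bank that only one side may own at a time."""
    def __init__(self):
        self.words = []
        self.host_turn = threading.Semaphore(1)   # host owns the bank first
        self.fpga_turn = threading.Semaphore(0)

def fpga_model(bank, rounds):
    # Stand-in for the FPGA algorithm: double every word in place.
    for _ in range(rounds):
        bank.fpga_turn.acquire()          # wait until the host hands the bank over
        bank.words = [2 * w for w in bank.words]
        bank.host_turn.release()          # hand the bank back

def host_program(bank, batches):
    results = []
    for batch in batches:
        bank.host_turn.acquire()
        bank.words = list(batch)          # write the input data
        bank.fpga_turn.release()          # pass ownership to the FPGA
        bank.host_turn.acquire()          # block until the FPGA has finished
        results.append(bank.words)        # read the results back
        bank.host_turn.release()
    return results

bank = SharedBank()
batches = [[1, 2, 3], [10, 20]]
worker = threading.Thread(target=fpga_model, args=(bank, len(batches)))
worker.start()
results = host_program(bank, batches)
worker.join()
print(results)   # [[2, 4, 6], [20, 40]]
```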
They use Handel-C for programming their FPGAs. Pyttel summarised its syntax and differences from vanilla C. It allows parallelism, supports a range of memory and interface types, and supports a variety of Xilinx and Altera targets. It can also interface to VHDL or similar generated blocks.
Pyttel gave an example of an IR filter where there was a requirement for the filter itself to be in VHDL, so they used Handel-C to describe the interface to this VHDL, with C macros acting like the function calls to the VHDL.
The design flow starts with the memory interface and algorithm + interface template separate; they both get to EDIF netlists eventually, which are merged using Xilinx place and route tools.
They found that it worked: the combination of Handel-C and VHDL worked well and was easy to do, and the platform itself was well suited to rapid prototyping.
We actually had a question on this one, with a chap wanting to know what happens if a VHDL library macro had a problem in it. Twofold answer; Handel-C has a debugger, and you should have tested library modules thoroughly before using them!
Dr. Cantó from TU Catalunya was looking at dynamic reconfiguration of the FIPSOC device to implement virtual circuits.
FPGA custom computing machines are aimed to run a wide variety of apps by being easy to configure. FIPSOC has on-chip memory, a microprocessor core, digital and analogue cells, interfaces to these, and connections to external memory and other devices. The digital cells have two contexts, with physical isolation between them. FIPSOC is fast because it can partially-reconfigure in two write cycles.
When you want to execute a large circuit that won't physically fit on one device, split it into temporal partitions, each partition mapped to a context of the device. Signals between contexts need to be buffered.
He described how combinatorial and sequential elements are partitioned, ensuring that data isn't dropped between partitions. It seems that they swap contexts at regular intervals, rather than only occasionally, like time-slicing compared to co-operative multitasking.
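For the software schmucks among us, the basic shape of temporal partitioning can be sketched as follows. This is an assumption-laden toy, not the FIPSOC technique: it takes a netlist as a map from each cell to the cells driving it, fills contexts in topological order up to a fixed capacity, and reports every net that crosses a context boundary and so would need buffering. The function name, capacity and example netlist are all hypothetical.

```python
from graphlib import TopologicalSorter   # standard library, Python 3.9+

def temporal_partition(fanin, capacity):
    """Assign cells to temporal partitions (contexts) of at most `capacity` cells.

    fanin maps each cell to the list of cells driving it.
    Returns (partition_of, buffered_nets), where buffered_nets lists the
    (source, destination) nets that cross from one partition to a later one.
    """
    order = list(TopologicalSorter(fanin).static_order())   # drivers come first
    partition_of = {cell: i // capacity for i, cell in enumerate(order)}
    buffered = sorted(
        (src, dst)
        for dst, srcs in fanin.items()
        for src in srcs
        if partition_of[src] != partition_of[dst]
    )
    return partition_of, buffered

fanin = {
    "g1": ["a", "b"],
    "g2": ["g1", "c"],
    "g3": ["g2"],
    "out": ["g3", "g1"],
}
partition_of, buffered = temporal_partition(fanin, capacity=4)
print(buffered)   # [('c', 'g2'), ('g1', 'g2'), ('g1', 'out')]
```

Because cells are placed in topological order, a signal only ever flows from an earlier context to a later one, which is what makes buffering between swaps sufficient in this simple feed-forward case.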
They validate the partitioning by simulation. It wasn't clear what his examples were supposed to be generating. A bit more explanation for us software schmucks would have been very handy.
Shared hardware gave a cost-effective solution, using a static D-latch to buffer combinational nets and a dynamic D-flipflop to buffer sequential nets.
They are working on implementing this partitioning technique on the CAD flow of FIPSOC devices.
Question about how to handle feedback in sequential nets: no problem, because splitting a macro cycle into micro cycles will mean that the micro cycles always do what the macro cycle did.
I was starving by the time lunch rolled around, but they'd laid on a reasonably good buffet for starters. I gobbled it quickly then headed up to set up my poster. Thanking my lucky stars that I'd brought along my own roll of sellotape, I wrestled with the poster board until I'd got all the information up there in the right order, lined up approximately right. A few of the guys setting up had brought their poster in a custom carrying case, which is definitely an option to investigate next time.
Half an hour after lunch started the caterers started to produce the main course, chicken breast stuffed with polenta, but I couldn't really be arsed as the queues were still substantial. Two hundred people going for a single-redundant buffet - it was never going to happen quickly. Bandwidth bottleneck!
The poster display kicked off at 2pm, when most people were still stuffed from lunch and feeling the after-effects of the copious amounts of alcohol served there. There was an initial buzz of visitors, then it quietened down as most people went to the main presentations, but as the coffee break appeared the interest picked up again. I had a reasonable number of questions on my topic, and learned quite a bit about what my fellow poster exhibitors were up to. A gentleman from NTT had a very interesting poster based on their new Plastic Cell Architecture.
By the end of the two hours my legs had practically locked up, so I was glad to be able to sit down and stretch. The final session kicked off at 4:30pm, and I went to the Technology Mapping track.
Russell Tessier from U Massachusetts opened the session. Hybrid FPGAs contain both LUTs and conventional PLAs in one device. The APEX20KE is a device on this model. The talk was about identifying parts of a program suitable for mapping to the PLA part, then mapping the selected parts onto the available resources.
Tessier showed the basic architecture of the APEX device, with LUTs and PLA blocks hanging off global interconnect, with some local interconnect too. He summarised previous related work by Kaviani which looked at PLAs with a small number of outputs, but noted that the approach doesn't scale well to a large number of outputs.
The PLA within the APEX device has 32 inputs, 32 product terms and 16 outputs, way more than Kaviani's approach could cope with.
Basic approach is to represent the design as a directed graph, traverse it to locate candidate subgraphs for PLA, evaluate these subgraphs for input, output and Pterm counts, and combine them into larger subgraphs to fill up the PLAs.
He described the top-down search of the graph, looking for subgraphs where the number of terminal outputs fits the number available in the PLA, then going backwards to find the maximum number of inputs in the PLA.
For each node we start from, we can identify one subgraph to be implemented in a PLA.
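To convince myself I'd followed the backwards search, here is a minimal sketch of my own, not Tessier's code. It assumes the design is a dictionary called `fanins` (a made-up name) mapping each gate to the signals feeding it, and it grows a single-output subgraph back from a chosen node for as long as the input count stays within the PLA limit. It ignores product term counts and any fan-out to logic outside the subgraph, both of which the real tool must handle.

```python
from collections import deque


def grow_cone(fanins, root, max_inputs):
    """Grow a single-output subgraph backwards from root.

    fanins maps each gate to the list of signals feeding it; any signal
    that is not a key of fanins is treated as a primary input.
    Returns (gates in the subgraph, signals entering the subgraph).
    """
    nodes = {root}
    inputs = set(fanins[root])
    queue = deque(fanins[root])
    while queue:
        sig = queue.popleft()
        if sig not in fanins or sig in nodes:
            continue  # primary input, or a gate we already absorbed
        # Inputs we would have if this gate were pulled into the subgraph
        trial = (inputs - {sig}) | (set(fanins[sig]) - nodes)
        if len(trial) <= max_inputs:
            nodes.add(sig)
            inputs = trial
            queue.extend(fanins[sig])
    return nodes, inputs


# Hypothetical three-gate example: y = f(a, b), a = g(x1, x2), b = h(x2, x3)
example = {"y": ["a", "b"], "a": ["x1", "x2"], "b": ["x2", "x3"]}
print(grow_cone(example, "y", 3))
```

With a limit of three inputs the whole example collapses into one subgraph; with a limit of two, neither feeder gate can be absorbed and only `y` itself is left.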
A statistical survey indicated that 32 product terms was about the optimum for 32 inputs, 16 outputs. You can minimise the number of product terms by complementing certain inputs.
To combine subgraphs they've written a cost function which expresses the desire to keep the number of inputs high and number of Pterms low, and use a greedy algorithm to maximise ranks.
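I didn't note down the actual cost function, so the following is only a rough sketch of the greedy idea under assumptions of my own. The `rank` function (inputs minus Pterms) is invented for illustration, Pterms are simply added up rather than shared, and the `Candidate` and `fill_pla` names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    inputs: frozenset
    outputs: int
    pterms: int


def rank(c):
    # Invented ranking: reward many inputs, penalise many product terms.
    return len(c.inputs) - c.pterms


def fill_pla(candidates, max_in=32, max_pt=32, max_out=16):
    """Greedily add the best-ranked candidates that still fit in one PLA."""
    chosen, inputs, pterms, outputs = [], frozenset(), 0, 0
    for c in sorted(candidates, key=rank, reverse=True):
        merged = inputs | c.inputs  # shared inputs are only counted once
        if (len(merged) <= max_in
                and pterms + c.pterms <= max_pt
                and outputs + c.outputs <= max_out):
            chosen.append(c.name)
            inputs = merged
            pterms += c.pterms
            outputs += c.outputs
    return chosen
```

The defaults are the 32 inputs, 32 product terms and 16 outputs quoted for the PLA; everything else is guesswork on my part.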
Comparing their Estimator with commercial Espresso, it's very close in terms of estimated Pterms but often in a 20th of the time. Even for small numbers of outputs, Estimator got close to what Kaviani did.
It was mapped onto the APEX20KE OK, and reduced LUT count by 9% by using the PLAs.
Jörn Abke, whom I'd met at lunch, was next up. CoMGen is the Configurable Module Generator with a generic logic block model, implemented as a Verilog module library. He was presenting a rapid prototyping design flow using CoMGen.
He described the traditional design flow of synthesis, partitioning, low-level technology mapping and then complex place and route algorithms. An optimised design flow, PuMA, integrates a module generator system with a high level partitioner and a floorplanner. The routing is done separately. This enables use of special FPGA features and heterogeneous prototyping systems.
Abke showed a detailed flowchart for the CoMGen system, starting with a Verilog Input / Module request and ending up with a placed vendor-specific netlist. CoMGen makes architectures selectable, maps arbitrary components, and is apparently easy to configure.
The FPGA configuration uses a generic logic block model (three basic types) with LUT-based FPGAs along with a carry-mode option. It is vendor- and series-independent, with a back-end netlist converter for the target device.
They have implemented objects such as the Braun and Pezaris multipliers, divider, multiplexor and combinatorial shift. For packing and decomposition, logic block packing is done by clustering, and for decomposition they use bin-packing.
Compared to Xilinx M1.5, CoMGen was pretty much the same (some a little better, some a little worse) on a set of benchmarks, but on runtime they are much better.
In the future they want to extend the module libraries, improve placement, look at FSM mapping and add capabilities to do inter-module optimisation.
Jörn had to deal with questions from a Xilinx rep in the audience (well, he had said that he could do better than their tools!), but handled them quite well.
Prof. Shashi Kumar of Jönköping University gave this presentation. When you want to put a large circuit into reprogrammable logic, you might not be able to fit it onto a single FPGA, so you need to consider how you connect multiple FPGAs to hold it. They've gone for hybrid (FPGA-FPGA and FPGA-reprogrammable interconnect-FPGA) connections.
Starting with a circuit they technology-map it, partition, embed it using the "MFB architecture", then work out the Inter- and Intra-FPGA routing. The trade-off is flexibility vs. number of wires.
Where you embed FPGAs in an asymmetric system may affect whether you can actually route it, because the resources you need might be in the wrong place (Duh!) But if you get a good embedding which is routable, you can use this as a start to minimise the number of pins, reducing delay.
Variations on the embedding problem:
He's tackling the first of these problems.
A topology-specific embedding tool uses heuristics but is very sensitive to the start solution. There's no tool which treats the topology as a parameter; the problem has not been formulated.
For the MFB architecture they assume each FPGA has the same number of neighbours, connected with the same number of wires, and connected in the same manner. A single logical FPID handles all programmable connections.
The first step is to work out the minimum number of programmable pins needed for a particular part of the circuit, and use this to get an initial embedding for a heuristic and a bounding value when computing the optimal solution.
The heuristic repeatedly picks up the part in an FPGA k which uses the maximum number of programmable connections and tries to swap parts stored in neighbouring FPGAs with those in FPGAs unconnected to k.
The optimal algorithm is a Branch and Bound search.
They use a circuit generator of MCNC benchmark circuits on their experimental setup, with 16-25 100Kgate FPGAs as the target.
The results weren't very clear, though a 50% reduction from random embedding appeared to be claimed.
I couldn't understand the details of either of the questions asked about this talk.
Xilinx can be counted on to present at FPL, and this year was no exception. Jason Anderson picked up the baton for this particular session. He noted that designs may use more than 1 I/O standard; Virtex supports 20 I/O standards, with resources grouped into banks, and there are restrictions about these groups. Hence, there's a constrained placement problem.
The restrictions are due to required voltages. (Warning! Low level stuff.) If each block has access to multiple reference voltages then it's flexible but then you're tying up lots of pads receiving reference voltages.
So instead, group IO blocks by banks where each bank has a single instance of each reference voltage. But banks can make placement difficult. You want the core logic up close to the banks that it talks to; cut down on the length of wires required. Not a trivial problem. They have a solution.
Uses simulated annealing to start with. If the I/O placement is not OK then they try constructive I/O packing. In annealing, cost is affected by wire length, timing and banking violation. The latter is a sum of the cost for each bank, which in turn is the amount of conflict in it: reference voltage conflicts (IOBs requiring multiple ref voltages in one bank) and output voltage conflicts (IOBs requiring multiple output voltages in one bank).
No guarantee that annealing will find a legal solution!
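As a sanity check on my notes, here is a small sketch of how such a banking term might be computed. It is my own guess and not Xilinx's formulation: I count each extra distinct voltage in a bank as one unit of conflict, and the `w_bank` weighting and all the function names are made up.

```python
def bank_conflicts(iobs):
    """Conflict count for one bank.

    iobs is a list of (vref, vout) pairs, with None meaning the IOB
    does not care about that voltage.
    """
    vrefs = {vref for vref, _ in iobs if vref is not None}
    vouts = {vout for _, vout in iobs if vout is not None}
    # One distinct voltage of each kind is fine; every extra one conflicts.
    return max(0, len(vrefs) - 1) + max(0, len(vouts) - 1)


def banking_violation(banks):
    """Sum of the per-bank conflict counts."""
    return sum(bank_conflicts(iobs) for iobs in banks)


def placement_cost(wirelength, timing, banks, w_bank=10.0):
    """Annealing cost: wire length plus timing plus weighted banking term."""
    return wirelength + timing + w_bank * banking_violation(banks)
```

A placement is legal as far as banking goes when `banking_violation` comes out as zero, which annealing on its own is not guaranteed to reach.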
The I/O improvement uses a weighted bipartite matching approach. You define costs for moving objects around in terms of the change in wirelength cost, change in timing cost and a fudge factor to stop any voltage conflict arising. The algorithm finds a minimum cost matching; if its cost is non-infinite then the placement is feasible. It can fix minor violations of the banking rule.
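To show what "minimum cost matching with infinite costs" means in practice, here is a toy version. It tries every assignment by brute force, which is only sensible for a handful of I/Os; a production tool would presumably use a proper assignment algorithm such as the Hungarian method. The cost matrix layout and the function name are my own assumptions.

```python
import math
from itertools import permutations


def min_cost_matching(cost):
    """Brute-force minimum cost matching on a square cost matrix.

    cost[i][j] is the cost of moving I/O i to site j, with math.inf
    marking a move that would create a voltage conflict.
    Returns (total cost, assignment), or (math.inf, None) if infeasible.
    """
    n = len(cost)
    best_total, best_assign = math.inf, None
    for perm in permutations(range(n)):
        total = sum(cost[i][perm[i]] for i in range(n))
        if total < best_total:
            best_total, best_assign = total, perm
    return best_total, best_assign


INF = math.inf
print(min_cost_matching([[1, INF, 4], [2, 3, INF], [INF, 1, 2]]))
```

A finite total means every I/O found a legal home; an infinite one means this phase could not repair the placement.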
Constructive packing binpacks I/Os into banks in a way we know is feasible, then re-executes the I/O improvement phase on this baseline.
They checked this out on a set of customer circuits with multiple I/O standards. Given the I/O constraints, the wirelengths only went up by a few percent. So it seems to work OK.
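The packing step can be pictured as a first-fit bin-pack with a compatibility check. This is a deliberately simplified sketch of my own: it insists that every I/O in a bank has exactly the same reference and output voltage, which is stricter than the real rules, and the names are invented.

```python
def pack_into_banks(ios, num_banks, bank_size):
    """First-fit packing of I/Os into voltage-compatible banks.

    ios is a list of (name, vref, vout) tuples. Returns a list of banks
    (each a list of I/O tuples), or None if the I/Os do not fit.
    """
    banks = [[] for _ in range(num_banks)]
    # Sorting by voltage keeps I/Os with identical requirements together.
    for io in sorted(ios, key=lambda item: (item[1], item[2])):
        _, vref, vout = io
        for bank in banks:
            compatible = all((r, o) == (vref, vout) for _, r, o in bank)
            if len(bank) < bank_size and compatible:
                bank.append(io)
                break
        else:
            return None  # no bank could take this I/O
    return banks
```

The result is legal by construction but takes no account of wire length, which is why the improvement phase is run again afterwards.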
Is there a name for people who congenitally ask questions at the end of talks? A polite one, I mean.
Holger Kropp from U Hannover was looking at data compression techniques. Huffman coding is built up in a tree-like way to generate a minimum-length encoding of data. He gave an example of how it is used.
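For anyone who hasn't met Huffman coding, this is the textbook software construction, included as a refresher rather than anything from the talk. It repeatedly merges the two least frequent subtrees, so frequent symbols end up with short codes. The function name is my own.

```python
import heapq
from collections import Counter


def huffman_codes(text):
    """Build a Huffman code table {symbol: bit string} for text."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        return {sym: "0" for sym in freq}  # lone symbol still needs a bit
    # Heap entries: (weight, unique tie-breaker, {symbol: code so far})
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)  # lightest subtree gets a 0
        w2, _, c2 = heapq.heappop(heap)  # next lightest gets a 1
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]


print(huffman_codes("aaaabbc"))
```

In the example, `a` occurs most often and gets a one-bit code while `b` and `c` get two bits each.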
The encoder/decoder sequence needs a code tree for one of them, and a reverse code tree for the other. Aim is to derive an efficient mapping of such reverse code trees onto LUT-based FPGAs. It's based on Mukherjee et al., who map n nodes onto n simple cells of gates and flip flops.
Each node is mapped onto a basic cell, cell type depending on node type. Single nodes and leaves only need one FF, all others onto OR-gate switches. Also need an entry point decoder, and a large OR to get the data out.
A token is inserted into the tree and passes up one LUT at a time; only one token is in the tree at a time. If it reaches the top cell then the signal is encoded and the next encoding starts.
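A software caricature of the token walk, as I understood it, is below. The `parent` table, the bit labels on the edges and the final reversal of the bits are all my assumptions; I don't know how the hardware actually orders the output bits.

```python
def encode_symbol(parent, symbol):
    """Walk a token from a symbol's leaf up to the root of the code tree.

    parent maps each node to (parent node, bit on the connecting edge);
    the root has no entry. Returns (codeword, number of hops taken).
    """
    bits = []
    node = symbol  # the token enters at the symbol's leaf
    steps = 0
    while node in parent:  # one hop per clock until the token hits the root
        node, bit = parent[node]
        bits.append(bit)
        steps += 1
    # Bits were collected leaf-to-root, so flip them into root-to-leaf order.
    return "".join(reversed(bits)), steps


# Hypothetical tree for the codes a=1, b=01, c=00
tree = {"a": ("root", "1"), "n1": ("root", "0"),
        "b": ("n1", "1"), "c": ("n1", "0")}
print(encode_symbol(tree, "b"))
```

The hop count is the point of interest: with one token in the tree at a time, a symbol takes as many cycles to encode as its depth in the tree.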
But it's not so good for different LUT sizes. Instead they adapted the basic logic element (BLE) with a k-input LUT, a FF and a mux. They increase the number of inputs used by merging nodes. The tree is rotated, partitioned by depth and entry points, and then they add a cut line to separate out the set furthest from the root.
Repeat until the tree is divided into p partitions, then for each partition merge the nodes at the same depth into one BLE and connect the BLEs appropriately.
Implemented in C++, and a netlist generated. Improvements of 12-40% achieved for different VLC tables compared to the direct mapping approach, with greater gains for larger tables and for wider BLEs.
Phew! A long day was over. I gathered my stuff and dumped it back at the hotel, leaving my poster up at the conference centre for posterity. The day's rain had given way, just in time, to an evening sun that was breaking through the clouds. Some of the 2000m peaks off to the southeast were visible above what cloud remained, impressively clear and barren. Got to get up those one of these days when I'm properly equipped and with like-minded lunatics.
Alerted by a lack of the ubiquitous credit card agency signs, I questioned the lady at the front desk (another of the hotel's star candidates for "Brightest Smile in Villach") to discover that no, Man's flexible friend did not flex in this area of town. Cash or broken kneecaps, that was the choice. Arse. Fortunately a nearby cash point was persuaded by my Visa to cough up the readies, though I don't like to think what my bank will charge me for the privilege.
Needing liquid recovery, I retired to last night's Bierhof to find out whether the Villacher Dunkel was as good as I remembered. It was; dark, sweet and malty. I'd sell my sister for a keg of the stuff, though admittedly I don't have one. Perhaps this is just as well. They also provided excellent beef wrapped in ham, and tasty apple strudel in custard. Definitely a recommended joint.
Looking back on the conference, the first day alone has been worth the cost of attending. About 50% of the talks were outright interesting, with most of the rest providing something useful. I've given a poster, not found it too scary, and had useful chats with a number of people from academe and industry. Tomorrow I can relax a bit and see more of the attendees. At least no-one - yet - has come up to me and said "by the way, you do know that X has already done what you're trying to do?" which was my private nightmare.