Slide 1: Lecture 26a: Software Environments for Embedded Systems
Prepared by: Professor Kurt Keutzer
Computer Science 252, Spring 2000
With contributions from:
- Jerry Fiddler, Wind River Systems
- Minxi Gao, Xiaoling Xu, UC Berkeley
- Shiaoje Wang, Princeton

Slide 2: SW: Embedded Software Tools
[Figure: block diagram. Labels: CPU, ROM, RAM, ASIC, ASIC, RTOS, a.out, application software, simulator, compiler, application source code, debugger, user]

Slide 3: Another View of Microprocessor Architecture
Let's look at current architectural evolution from the standpoint of software developers, in particular Jerry Fiddler.

Slide 4: Fiddler's Predictions for the Next Ten Years (2010)
- End of the “Age of the PC”
- Lots of exciting applications
- Development will continue to be hard
  - Even as we and our competitors continue to make incredible efforts
- Chips: no predictions
- MEMS / nano-technology and sensors will impact us
(J. Fiddler - WRS)

Slide 5: Fundamental Principles
- Computers are, and will be, everywhere
- The world itself is becoming more intelligent
- Our infrastructure will have major software content
- Most of our access to information will be through embedded systems
- Economics will inexorably drive deployment of embedded systems
- The Internet is one important factor in this trend
- Reliability is a critical issue
- EVERY tech and mfg. business will need to become good at embedded software
(J. Fiddler - WRS)

Slide 6: What Will Be Embedded in Ten Years?
- Everything that is now electro-mechanical
- Machines (nano-machines)
- Analog signals
- Anything that communicates
- Lots of stuff in our cars
- Our bodies
  - Today: pacemakers
  - Soon: defibrillators, insulin dispensers
  - We can all be the $6M Person, for a lot cheaper
- All sorts of interfaces
  - Speech, DNI, etc.
(J. Fiddler - WRS)

Slide 7: Embedded Microprocessor Evolution
- Embedded CPU cores are getting smaller: 2 mm² for up to 400 MHz
  - Less than 5% of CPU size
- Higher performance by: faster clock, deeper pipelines, branch prediction, ...
- Trend is towards higher integration of processors with:
  - Devices that were on the board now on chip: “system on a chip”
  - Adding more compute power by add-on DSPs, ...
  - Much larger L1 / L2 caches on silicon
(J. Fiddler - WRS)

Slide 8: Microprocessor Chaos
[Figure: timeline with year markers 1980, 1990, 1996, 1998 showing a growing set of embedded processor families. Labels: 68000, 680x0, CPU32, 80x86, PowerPC, 29k, SPARC, MIPS R3k, MIPS 3k/4k/5k, i960, SH 1/2/3, SH 4, SH-DSP, RAD 6k, Siemens C16x, NEC V8xx, PARISC, 563xx, ST 20, M32 R/D, StrongARM, ARM, MCORE]
(J. Fiddler - WRS)

Slide 9: A Challenging Environment
(J. Fiddler - WRS)

Slide 10: New Hardware Challenges Software Development
- More and more architectures
- User-customizable processors
- More power demands more software functionality
- Software is not following Moore's law (yet)
- System-on-a-chip
- DSP
(J. Fiddler - WRS)

Slide 11: Embedded Software Crisis
[Figure: pressures leading to the "Embedded Software Crisis"]
- Cheaper, more powerful microprocessors
- More applications
- Increasing time-to-market pressure
- Bigger, more complex applications
(J. Fiddler - WRS)

Slide 12: SW: Embedded Software Tools
[Figure: block diagram. Labels: CPU, ROM, RAM, ASIC, ASIC, RTOS, a.out, application software, simulator, compiler, application source code, debugger, user]

Slide 13: Outline on RTOS
- Introduction
- VxWorks
  - General description
    - System
    - Supported processors
  - Details
    - Kernel
    - Custom hardware support
    - Closely coupled multiprocessor support
    - Loosely coupled multiprocessor support
- pSOS
- eCos
- Conclusion

Slide 14: Embedded Development: Generation 0
- Development: sneaker-net
- Attributes:
  - No OS
  - Painful!
  - Simple software only

Slide 15: Embedded Development: Generation 1
- Hardware: SBC, minicomputer
- Development: native
- Attributes:
  - Full-function OS
  - Non-scalable
  - Non-portable
  - Turnkey
  - Very primitive

Slide 16: Embedded Development: Generation 2
- Hardware: embedded
- Development: cross, serial line
- Attributes:
  - Kernel
  - Originally no file system, I/O, etc.
  - No development environment
  - No network
  - Non-portable, in assembly

Slide 17: Embedded Development: Generation 3
- Hardware: SBC, embedded
- Development: cross, Ethernet
  - Integrated, text-based, Unix
- Attributes:
  - Scalable, portable OS
    - Includes network, file and I/O systems, etc.
  - Tools on target
    - Network required
    - Heavy target required for development
  - Closed development environment

Slide 18: Embedded Development: Generation 4
- Hardware: embedded, SBC
- Development: cross
  - Any tool - any connection - any target
  - Integrated GUI, Unix and PC
- Attributes:
  - Tools on host
    - No target resources required
    - Far more powerful tools (WindView, CodeTest, ...)
  - Open development environment, published API
  - Internet is part of the development environment
    - Support, updates, manuals, etc.

Slide 19: Embedded Development: Generation 5?
- Super-scalable
- Communications-centric
- Virtual application platform
  - Java?
- Multi-media
- Way-cool development environment
  - Much easier to create, debug and re-use code
  - Easy for non-programmers to contribute

Slide 20: The RTOS Evolution
[Figure: chart; label: Kernel]
*Percent of total software supplied by RTOS vendor in a typical embedded device

Slide 21: Introduction to RTOS
- Wind River Systems Inc.: VxWorks, http:/
- Integrated Systems Inc.: pSOS, http:/
- Cygnus Inc. = RedHat: eCos, http:/ =

Slide 22: VxWorks
VxWorks 5.4 Scalable Run-Time System

Slide 23: VxWorks: Supported Processors
- PowerPC
- 68K, CPU 32
- ColdFire
- MCORE
- 80x86 and Pentium
- i960
- ARM and StrongARM
- MIPS
- SH
- SPARC
- NEC V8xx
- M32 R/D
- RAD6000
- ST 20
- TriCore

Slide 24: VxWorks: Wind microkernel
- Task management
  - Multitasking, unlimited number of tasks
  - Preemptive scheduling and round-robin scheduling (static scheduling)
  - Fast, deterministic context switch
  - 256 priority levels

Slide 25: VxWorks: Wind microkernel
- Fast, flexible inter-task communication
  - Binary, counting and mutual exclusion semaphores with priority inheritance
  - Message queues
  - POSIX pipes, counting semaphores, message queues, signals and scheduling control
  - Sockets
  - Shared memory

Slide 26: VxWorks: Wind microkernel
- High scalability
- Incremental linking and loading of components
- Fast, efficient interrupt and exception handling
- Optimized floating-point support
- Dynamic memory management
- System clock and timing facilities

Slide 27: VxWorks: Board Support Package
- BSP = initialization code for the hardware device + device drivers for peripherals
- BSP Developer's Kit
[Figure: BSP]

Slide 28: VxWorks: VxMP
A closely coupled multiprocessor support accessory for VxWorks.
Capabilities:
- Supports up to 20 CPUs
- Binary and counting semaphores
- FIFO message queues
- Shared memory pools and partitions
- VxMP data structures are located in a shared memory area accessible to all CPUs
- Name service (translates symbol name to object ID)
- User-configurable shared memory pool size
- Supports a heterogeneous mix of CPUs

Slide 29: VxWorks: VxMP
Hardware requirements:
- Shared memory
- Indivisible hardware read-modify-write mechanism across the shared memory bus
- CPU interrupt capability for best performance
Supported architectures:
- 680x0 and 683xx
- SPARC
- SPARClite
- PPC6xx
- MIPS
- i960

Slide 30: VxWorks: VxFusion
VxWorks accessory for loosely coupled configurations and standard IP networking. It extends the VxWorks message queue into a distributed message queue.
Features:
- Media-independent design
- Group multicast/unicast messaging
- Fault-tolerant, locale-transparent operations
- Heterogeneous environment
Supported targets:
- Motorola: 68K, CPU32, PowerPC
- Intel: x86, Pentium, Pentium Pro

Slide 31: pSOS
pSOS 2.5

Slide 32: pSOS: Supported Processors
- PowerPC
- 68K
- ColdFire
- MIPS
- ARM and StrongARM
- x86 and Pentium
- i960
- SH
- M32/R
- m.core
- NEC v8xx
- ST20
- SPARClite

Slide 33: pSOS: pSOS+ kernel
- Small real-time multitasking kernel
- Preemptive scheduling
- Supports memory regions for different tasks
- Mutex semaphores and condition variables (priority ceiling)
- No interrupt handling is included

Slide 34: pSOS: Board Support Package
BSP = skeleton device driver code + code for the low-level system functions each particular device requires

Slide 35: pSOS: pSOS+m kernel
- Tightly coupled or distributed processors
- pSOS API + communication and coordination functions
- Fully heterogeneous
- Connection can be any one of shared memory, serial or parallel links, or Ethernet implementations
- Dynamic creation, modification and deletion of OS objects
- Completely device independent

Slide 36: eCos

Slide 37: eCos: Supported Processors
- Advanced RISC Machines ARM7
- Fujitsu SPARClite
- Matsushita MN10300
- Motorola PowerPC
- Toshiba TX39
- Hitachi SH3
- NEC VR4300
- MB8683x series
- Intel StrongARM

Slide 38: eCos: Kernel
- No definition of task; supports multiple threads
- Interrupt and exception handling
- Preemptive scheduling: time-slice scheduler, multi-level queue scheduler, bitmap scheduler and priority inheritance scheduling
- Counters and clocks
- Mutexes, semaphores, condition variables, message boxes

Slide 39: eCos: Hardware Abstraction Layer
- Architecture HAL abstracts the basic CPU, including:
  - interrupt delivery
  - context switching
  - CPU startup, etc.
- Platform HAL abstracts the current platform, including:
  - platform startup
  - timer devices
  - I/O register access
  - interrupt control
- Implementation HAL abstracts properties that lie between the above:
  - architecture variants
  - on-chip devices
- The boundaries among them blur.

Slide 40: Summary on RTOS

Slide 41: VxWorks: Recall the Board Support Package
- BSP = initialization code for the hardware device + device drivers for peripherals
- BSP Developer's Kit
[Figure: BSP]

Slide 42: Introduction to Device Drivers
What are device drivers?
- They make the attached device work.
- They insulate the application from the complexities involved in I/O handling.
[Figure labels: Application, Device driver, Hardware, RTOS]

Slide 43: Proliferation of Interfaces
- New connections
  - USB
  - 1394
  - IrDA
  - Wireless
- New models
  - JetSend
  - Jini
  - HTTP / HTML / XML / ?
  - Distributed objects (DCOM, CORBA)

Slide 44: Leads to Proliferation of Device Drivers
(Courtesy - Synopsys)

Slide 45: Device Driver Characterization
Device driver functionalities:
- initialization
- data access
- data assignment
- interrupt handling

Slide 46: Device Characterization
- Block devices: devices with fixed data block sizes
- Character devices: byte-stream devices
- Network devices: manage local area network and wide area network interconnections

Slide 47: I/O Processing Characteristics
- Initialization
  - make itself known to the kernel
  - initialize the interrupt handling
  - optional: allocate the temporary memory for the device driver
  - initialize the hardware device
- Front-end processing
  - initiation of an I/O request
- Back-end processing
  - handles the completion of I/O operations

Slide 48: Commercial Resources
- Aisys DriveWay 3DE: Motorola MPC860, MC68360, MC68302, AMD E86, Philips XA, 8C651, PIC 16/17
- Stenkil MakeApp: Hitachi H8, SH1, SH3, SH7x, HCAN
- Intel's ApBuilder
- Motorola MCUnit
- GO DSP Code Composer: TI DSPs
- CoWare

Slide 49: Aisys 3DE DriveWay Features
- Extensive documentation: K.B. help along the way, as detailed as a chip manual: traffic.ext, traffic.dwp
- CNFG for configuring the chip, such as memory and clock; gives a warning if necessary
- Can generate test functions
- Can insert user code
- One file for each peripheral

Slide 50: DriveWay Design Methodology
[Figure: flow diagram. Labels: GUI, .DLL, K.B., code “generator”, .DWP, output files. Annotations: chip specific; user data; little generation, more manipulation; manipulation of K.B. database]

Slide 51: K.B. Database
- A specific K.B. per chip family
- Family of chips
- chip
- peripherals
- functional objects (timer, PWM counter)
- functions
- physicals (register settings, values, clock rate)
- actual code

Slide 52: DriveWay Builder
- Add chip
- Add peripheral
- Create skeleton, link to other things such as the GUI
- Code reuse when adding a new chip to an existing family, e.g., use code for the MPC 860 for the MPC 821
- Easy to create the infrastructure, but the specifics have to be written

Slide 53: About the code generator (1)
- Cut and paste from the K.B. database
- Areas where we can use automation for device driver generation:
  - model the user specification
  - extract useful information for drivers from the HDL description of the chip
    - map registers
    - interrupts

Slide 54: About the code generator (2)
Why is Aisys not using automation?
- Commercial efficiency
  - e.g., it is easier to capture the user specification from the GUI than from a model such as UML or a state machine
- HDL code is too low level; it is hard to extract information from it

Slide 55: CoWare Interface Synthesis
- System suggests hardware/software interface protocols
  - Handshaking, memory-mapped I/O, interrupt scheme, DMA
- Designer selects communication protocols and memory
- System synthesizes efficient device drivers and glue logic
[Figure labels: Hardware, Glue Logic, Software, Device Driver]

Slide 56: Interface Synthesis Example: Memory Mapped I/O
[Figure label: HW]

Slide 57: SW: Embedded Software Tools
[Figure: block diagram. Labels: CPU, ROM, RAM, ASIC, ASIC, RTOS, a.out, application software, simulator, compiler, application source code, debugger, user]

Slide: ASIC Value Proposition
- 20% area decrease in ASIC portion
- 25% higher performance
- Move to a higher level: HDL description at RTL

Slide 59: The Importance of Code Size
- Based on base 0.18 µm implementation plus code RAM or cache
- Xtensa code is 10% smaller than ARM9 Thumb, 50% smaller than MIPS-Jade, ARM9 and ARC
- ARM9-Thumb has reduced performance
- RAM/cache density = 8 KB/mm²
(Killian - Tensilica)

Slide: SW Compiler Value Proposition
- 20% area decrease over ASIC portion
- 20% area decrease in RAM portion
- 25% higher performance
- Move to a higher level: C rather than assembler

Slide 61: Memory? StrongARM Processor
[Figure: Compaq/Digital StrongARM]

Slide 62: Compiler Support
BUT, few companies focused on compiler support for embedded systems:
- Cygnus = RedHat
- Tartan = TI
- Green Hills
Why? Bad buying behaviors:
- few seats, low ASPs

Slide 63: Current Status on Compiler Support
- Adequate compiler and debugger support in breadth and quality for embedded microprocessors/microcontrollers
  - ARM
  - MIPS
  - PowerPC
  - Mot family
  - From: Cygnus/RedHat, manufacturer, Green Hills
- DSPs still poorly supported
  - Tartan acquired by Texas Instruments
  - WHY?
- NO support for growing generation of special purpose processors:
  - TMS320C80
  - IXP1200

Slide 64: Recall: Architectural Features of DSPs
- Data path configured for DSP
  - Fixed-point arithmetic
  - MAC: multiply-accumulate
- Multiple memory banks and buses
  - Harvard architecture
  - Multiple data memories
- Specialized addressing modes
  - Bit-reversed addressing
  - Circular buffers
- Specialized instruction set and execution control
  - Zero-overhead loops
  - Support for MAC
- Specialized peripherals for DSP

Slide 65: Example: IXP1200
[Figure: block diagram. Labels: SDRAM (up to 256 MB), SRAM (up to 8 MB), Boot ROM (up to 8 MB), Peripherals, Ethernet MAC, ATM, T1/E1, another IXP1200, FIFO bus 66 MHz, host CPU (optional), PCI MAC devices, PCI bus 66 MHz; bus widths 64, 64, 32, 32]

Slide 66: IXP1200 Network Processor
- 6 micro-engines
  - RISC engines
  - 4 contexts/engine
  - 24 threads total
- IX Bus interface
  - packet I/O
  - connect IXPs
    - scalable
- StrongARM
  - less critical tasks
- Hash engine
  - level 2 lookups
- PCI interface

Slide 67: Summary
- Embedded software support for microcontrollers and microprocessors is broadly available and of adequate quality
  - RTOS
  - Device drivers
  - Compilers
  - Debuggers
- Embedded software support for DSP processors is inadequate:
  - Patchy support: many parts lack support
  - Quality poor: lags hand coding by 20-100%
- Embedded software support for special purpose processors is often non-existent
- Still in a "build the hardware, then write the software" world
- Alternatives?
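
Supplementary sketch: device driver structure. The driver functions named under "Device Driver Characterization" (initialization, data access, interrupt handling) and the front-end/back-end split under "I/O Processing Characteristics" could look roughly like the following in C. This is a minimal sketch under stated assumptions, not code from any RTOS or tool mentioned in the lecture: the register layout `dev_regs`, the bit masks, and the functions `dev_init`, `dev_isr` and `dev_read` are all hypothetical, and the registers are an ordinary struct so the code can run on a host instead of real memory-mapped hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register layout of a simple character device. On real
 * hardware this struct would be overlaid on a fixed bus address. */
typedef struct {
    volatile uint32_t ctrl;    /* bit 0: enable, bit 1: RX interrupt enable */
    volatile uint32_t status;  /* bit 0: RX data ready */
    volatile uint32_t data;    /* received byte in the low 8 bits */
} dev_regs;

#define CTRL_ENABLE  0x1u
#define CTRL_RX_IRQ  0x2u
#define STAT_RX_RDY  0x1u
#define RX_BUF_SIZE  16u

typedef struct {
    dev_regs *regs;
    uint8_t   rx_buf[RX_BUF_SIZE];  /* ring buffer filled by the ISR */
    unsigned  rx_head, rx_tail;     /* head: next write, tail: next read */
} dev_driver;

/* Initialization: bind to the registers, reset driver state,
 * then enable the device and its receive interrupt. */
static void dev_init(dev_driver *d, dev_regs *regs)
{
    assert(d != NULL && regs != NULL);
    d->regs = regs;
    d->rx_head = d->rx_tail = 0;
    regs->ctrl = CTRL_ENABLE | CTRL_RX_IRQ;
}

/* Back-end processing: the interrupt handler moves a ready byte
 * from the data register into the ring buffer. */
static void dev_isr(dev_driver *d)
{
    if (d->regs->status & STAT_RX_RDY) {
        unsigned next = (d->rx_head + 1u) % RX_BUF_SIZE;
        if (next != d->rx_tail) {             /* drop the byte if the buffer is full */
            d->rx_buf[d->rx_head] = (uint8_t)d->regs->data;
            d->rx_head = next;
        }
        d->regs->status &= ~STAT_RX_RDY;      /* acknowledge the interrupt */
    }
}

/* Front-end processing: the application's read request.
 * Copies up to max buffered bytes and returns the count. */
static size_t dev_read(dev_driver *d, uint8_t *out, size_t max)
{
    size_t n = 0;
    while (n < max && d->rx_tail != d->rx_head) {
        out[n++] = d->rx_buf[d->rx_tail];
        d->rx_tail = (d->rx_tail + 1u) % RX_BUF_SIZE;
    }
    return n;
}
```

In a real driver the ISR would be registered with the kernel during initialization, and the ring buffer indices would need protection against concurrent access; both are omitted here to keep the sketch short.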
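
Supplementary sketch: a DSP kernel in C. The Summary notes that compiled code for DSPs lags hand coding. One way to see why is to write a typical kernel in portable C and compare it with the features listed under "Architectural Features of DSPs". The FIR filter step below is only an illustrative sketch: the names `fir_state` and `fir_step`, the tap count `NTAPS`, and the Q15 fixed-point format are assumptions chosen for the example. The index masking stands in for hardware circular addressing, the wide accumulator for the MAC unit, and the inner loop for a zero-overhead loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NTAPS 4u  /* hypothetical filter length; a power of two so indices wrap with a mask */

typedef struct {
    int16_t  delay[NTAPS];  /* circular delay line of past input samples */
    unsigned head;          /* index of the newest sample */
} fir_state;

/* Push one Q15 input sample and return one Q15 output sample. */
static int16_t fir_step(fir_state *s, const int16_t coeff[NTAPS], int16_t x)
{
    assert(s != NULL && coeff != NULL);

    s->head = (s->head + 1u) & (NTAPS - 1u);   /* circular-buffer wrap */
    s->delay[s->head] = x;

    int32_t acc = 0;                            /* wide accumulator, as in a MAC unit */
    unsigned idx = s->head;
    for (unsigned k = 0; k < NTAPS; k++) {
        acc += (int32_t)coeff[k] * s->delay[idx];  /* multiply-accumulate */
        idx = (idx - 1u) & (NTAPS - 1u);           /* step back to the next older sample */
    }
    return (int16_t)(acc >> 15);                /* scale the Q30 sum back to Q15 */
}
```

With four equal taps of 8192 (0.25 in Q15) the filter is a moving average: feeding it the samples 1000, 1000, 1000, 1000, 0, 0 from a zeroed state yields 250, 500, 750, 1000, 750, 500. A compiler has to recognize the masking, the widening multiply and the loop as candidates for the specialized hardware, whereas a hand coder would use those features directly.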