The overview of Networking Technology & New Generation Processors
Boxuan Gu Chi Chau
CS-521, 2-5-2004

Part 1: Networking Technology

The lecture consists of two parts
- Network Architecture
- Ethernet technology

Network Architecture - OSI reference model

OSI
The OSI model provides a conceptual framework for communication between computers, but the model itself is not a method of communication. Actual communication is made possible by using communication protocols. In the context of data networking, a protocol is a formal set of rules and conventions that governs how computers exchange information over a network medium. A protocol implements the functions of one or more of the OSI layers.

OSI - Interaction

OSI - Encapsulation

TCP/IP

TCP/IP - IP
The Internet Protocol (IP) is a network-layer (Layer 3) protocol that contains addressing information and some control information that enables packets to be routed. IP has two primary responsibilities: providing connectionless, best-effort delivery of datagrams through an internetwork, and providing fragmentation and reassembly of datagrams to support data links with different maximum transmission unit (MTU) sizes.

IP Packet Format

IP address format

IP address

TCP/IP - TCP (Transmission Control Protocol)
TCP provides reliable transmission of data in an IP environment. TCP corresponds to the transport layer (Layer 4) of the OSI reference model. Among the services TCP provides are stream data transfer, reliability, efficient flow control, full-duplex operation, and multiplexing. TCP offers reliability by providing connection-oriented, end-to-end reliable packet delivery through an internetwork.

TCP/IP - UDP (User Datagram Protocol)
The User Datagram Protocol (UDP) is a connectionless transport-layer protocol (Layer 4) that belongs to the Internet protocol family. UDP is basically an interface between IP and upper-layer processes. UDP protocol ports distinguish multiple applications running on a single device from one another.

UDP - Packet header

IPv6
Disadvantages of IPv4:
- The 32-bit address space is limited
- Routing is not efficient
- Poor support for mobile devices
- Security needs are growing

IPv6 Packet Header Format
Field                 Size
Version               4 bits
Traffic class         8 bits
Flow label            20 bits
Payload length        16 bits
Next header           8 bits
Hop limit             8 bits
Source address        128 bits
Destination address   128 bits

IPv6
Version Number: The version is a 4-bit field as in IPv4. The field contains the number 6 for IPv6, instead of the number 4 for IPv4.
Traffic Class: The Traffic Class field is an 8-bit field similar to the type of service (ToS) field in IPv4. The Traffic Class field tags the packet with a traffic class that can be used in Differentiated Services. The functionalities are the same in IPv4 and IPv6.

IPv6
Flow Label: The Flow Label field can be used to tag packets of a specific flow to differentiate the packets at the network layer. Hence, the Flow Label field enables identification of a flow and per-flow processing by the routers in the path.
Payload Length: Similar to the Total Length field in IPv4, the Payload Length field indicates the total length of the data portion of the packet.

IPv6
Next Header: Similar to the Protocol field in the IPv4 packet header, the value of the Next Header field in IPv6 determines the type of information following the basic IPv6 header.
Hop Limit: Similar to the Time to Live field in the IPv4 packet header, the value of the Hop Limit field specifies the maximum number of routers (hops) that an IPv6 packet can pass through before the packet is considered invalid.

IPv6
Source Address: The IPv6 source address field is similar to the Source Address field in the IPv4 packet header, except that the field contains a 128-bit source address for IPv6 instead of a 32-bit source address for IPv4.
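
The fixed header layout listed under "IPv6 Packet Header Format" (4 + 8 + 20 + 16 + 8 + 8 bits, followed by two 128-bit addresses, 40 bytes in total) can be sketched in Python as follows. This is only an illustration: the function names pack_ipv6_header and parse_ipv6_header and the sample field values are made up for this example and are not part of the lecture, and extension headers are not handled.

```python
import ipaddress
import struct

# 4+8+20 bits share one 32-bit word; then 16 + 8 + 8 bits; then 2 x 128 bits.
IPV6_HEADER_LEN = 40
_LAYOUT = "!IHBB16s16s"  # network byte order


def pack_ipv6_header(traffic_class, flow_label, payload_length,
                     next_header, hop_limit, src, dst):
    """Build the 40-byte fixed IPv6 header (hypothetical helper, no range checks)."""
    # Version (always 6), traffic class and flow label are packed into the first word.
    first_word = (6 << 28) | (traffic_class << 20) | flow_label
    return struct.pack(
        _LAYOUT,
        first_word,
        payload_length,
        next_header,
        hop_limit,
        ipaddress.IPv6Address(src).packed,
        ipaddress.IPv6Address(dst).packed,
    )


def parse_ipv6_header(data):
    """Split the first 40 bytes of a packet into the eight header fields."""
    if len(data) < IPV6_HEADER_LEN:
        raise ValueError("need at least 40 bytes for the fixed IPv6 header")
    first_word, payload_length, next_header, hop_limit, src, dst = struct.unpack(
        _LAYOUT, data[:IPV6_HEADER_LEN]
    )
    return {
        "version": first_word >> 28,
        "traffic_class": (first_word >> 20) & 0xFF,
        "flow_label": first_word & 0xFFFFF,
        "payload_length": payload_length,
        "next_header": next_header,
        "hop_limit": hop_limit,
        "source": str(ipaddress.IPv6Address(src)),
        "destination": str(ipaddress.IPv6Address(dst)),
    }


# Usage with illustrative values (next header 17 is UDP).
header = pack_ipv6_header(
    traffic_class=0, flow_label=0, payload_length=20,
    next_header=17, hop_limit=64,
    src="2031:0000:130F:0000:0000:09C0:876A:130B",
    dst="3FFE:400:280:0:0:0:0:1",
)
fields = parse_ipv6_header(header)
print(len(header), fields["version"], fields["hop_limit"])
```

Packing the version, traffic class and flow label into a single 32-bit word mirrors how the three fields share the first four bytes of the header.
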
Destination Address: The IPv6 destination address field is similar to the Destination Address field in the IPv4 packet header, except that the field contains a 128-bit destination address for IPv6 instead of a 32-bit destination address for IPv4.

IPv6 - Extension header

IPv6 - Extension header
- Hop-by-Hop Options header
- Destination Options header
- Routing header
- Fragment header
- Authentication header and Encapsulating Security Payload header
- Upper-Layer header

IPv6 - Addressing scheme
IPv6 uses 16-bit hexadecimal number fields separated by colons (:) to represent the 128-bit address.
Example: 2031:0000:130F:0000:0000:09C0:876A:130B

IPv6 - Addressing scheme
IPv6 addresses consist of a prefix and a local part (as in IPv4).
Example: 3FFE:400:280:0:0:0:0:1/48. Here the first 48 bits are fixed (the prefix) and the other 80 bits are assigned in the local subnet.

IPv6 - Addressing scheme
In IPv6, there are 3 types of addresses:
1. Unicast
2. Multicast
3. Anycast (new in IPv6)

IPv6 - Addressing scheme - Unicast

IPv6 - Addressing scheme - Multicast

IPv6 - Addressing scheme - Anycast
Packets sent to an anycast address or list of addresses are delivered to the nearest interface identified by
that address. Anycast is a communication between a single sender and a list of addresses.

Part 2: Ethernet

Ethernet

Ethernet MAC Data Frame Format

Ethernet - 10 Gigabit Ethernet
10 Gigabit Ethernet is Ethernet. 10 Gigabit Ethernet uses the IEEE 802.3 Ethernet media access control (MAC) protocol, the IEEE 802.3 Ethernet frame format, and the IEEE 802.3 frame size. 10 Gigabit Ethernet is full duplex.

Ethernet - 10 Gigabit Ethernet Technology and Standard
The IEEE 802.3ae 10 Gigabit Ethernet Task Force was chartered with developing the 10 Gigabit Ethernet Standard. This group is a subcommittee of the larger 802.3 Ethernet Working Group. In contrast to previous Ethernet standards, 10 Gigabit Ethernet targets three application spaces: LANs, MANs, and WANs.

Cont.
Gigabit Ethernet is no longer a shared-domain, half-duplex technology. Because there are no packet collisions in a full-duplex link, the link distances are determined by optics and not by the diameter of an Ethernet collision domain. 10 Gigabit Ethernet will also be a full-duplex, switched technology, maintaining compatibility with the 802.3 Ethernet MAC protocol and the Ethernet frame format.

Cont.

10 Gigabit Ethernet Layer 1: Physical Layer Devices
Contained within the PHY are several sublayers that perform these functions, including the physical coding sublayer (PCS) and the optical transceiver or physical media dependent (PMD) sublayer for fiber media. The PCS is made up of coding (for example, 8b/10b) and serializer or multiplexing functions.

Cont.
10 Gigabit Ethernet defines two kinds of PHY:
- the LAN PHY
- the WAN PHY

WAN PHY
SONET Friendly
- Enables use of SONET infrastructure for Layer 1 transport: SONET ADMs, DWDM transponders, optical regenerators
Not SONET Compliant
- Connects to SONET access devices but not directly to SONET infrastructure

Cont.
SONET Friendly
- Requires some SONET features:
  - OC-192 link speed
  - SONET framing
  - Minimal Path/Section/Line overhead processing
Not SONET Compliant
- Avoids the most costly aspects of SONET:
  - No TDM support; concatenated OC-192c only
  - Does not require meeting SONET grid laser specifications, jitter requirements, or stratum clocking
  - Minimal operations, administration, maintenance, and provisioning (OAM&P)

LAN PHY
10 Gigabit Ethernet defines a LAN PHY that, with simple encoding, will transmit Ethernet packets on dark fiber and dark wavelengths. The LAN PHY is intended to support the existing Ethernet applications at ten times the bandwidth with the most cost-effective solution.

Cont.

Cont.
Both the LAN and WAN PHY will support each physical medium-dependent (PMD) sublayer and, therefore, support the same distances. These PHYs are distinguished solely by the PCS. The WAN PHY differs from the
LAN PHY by the inclusion of a simplified SONET framer.

Cont.

10 Gigabit Ethernet Link Distance and Media Goals

Application of 10GE

10 Gigabit in the LAN

Cont.

10 Gigabit Ethernet Metropolitan Network

Part 2: AMD & Intel

Latest Desktop & Server Processors
AMD
- Desktop: AMD Athlon 64 FX, AMD Athlon 64
- Server: AMD Opteron
Intel
- Desktop: Intel Pentium 4 w/ HT, Intel Pentium 4 Extreme Edition
- Server: Intel Itanium 2, Xeon

Desktop Processor Pricing
AMD Athlon 64 FX-51                          $733
AMD Athlon 64 3400+                          $417
AMD Athlon 64 3200+                          $278
AMD Athlon 64 3000+                          $218
Intel Pentium 4 Extreme Edition 3.4 GHz      $999
Intel Pentium 4 3.4 GHz w/ HT                $424
Intel Pentium 4 3.2 GHz (Prescott) w/ HT     $417

Processor Timeline

Traditional Intel roadmap
- Intel historically would move to a smaller process, double the cache, and increase clock speeds
- This held until the first generation of the Pentium 4, while AMD was still struggling
- It is not the case for Prescott

Intel Pentium 4 (Prescott)
- Intel launched the Pentium 4 Prescott on February 2nd
- Not a P5, just the 3rd generation of the P4
- Intel president Paul Otellini has discussed a 64-bit extension for Prescott
- With enough cooling, Prescott can be overclocked to 5 GHz

P4 Prescott

New Changes
- Prescott uses a 90 nm process instead of a 130 nm process
- L2 cache doubled to 1 MB
- L1 data cache expanded to 16 KB to improve the AGUs (address generation units)
- 13 new instructions added, aka SSE3
- Pipeline extended from 20 to 31 stages
- Process and die size shrink
- Larger scheduler queues
- A dedicated integer multiplier added
- A new shifter/rotator logic block added to the ALUs

SSE3
After the great success of the P4 SSE2 instruction set (144 instructions), SSE3 added 13 more to make programmers' lives easier:
- fisttp: FP to integer conversion
- addsubps, addsubpd, movsldup, movshdup, movddup: complex arithmetic
- lddqu: video encoding
- haddps, hsubps, haddpd, hsubpd: graphics (SIMD FP / AOS)
- monitor, mwait: thread synchronization

31 Pipeline Stages

Hyper-Threading Technology
- Can increase performance by up to 40%
- HT enables multi-threaded software to execute threads in parallel
- It splits instructions into multiple streams so that multiple logical processors can work on them
- The problem is that not much software takes advantage of HT
- HT is big in the graphics arena, e.g. Adobe takes great advantage of HT

Prescott Problems
- The 90 nm process is not yet mature, unlike 130 nm
- The 90 nm process has heat and power problems
- The 3.4E GHz part is held back; the intent is to produce a limited edition
- SSE3 will be useful down the road, but today's software is not ready for it
- The 31-stage pipeline slows performance whenever a branch is mispredicted

Should you get Prescott?
- The real strength of Prescott is in its Hyper-Threading performance
- Great for multitasking
- In some applications Prescott beats the Extreme
  Edition in multitasking

Pentium 4 Extreme Edition
- Intel's top-of-the-line desktop processor
- A “Xeon” processor with a P4 Extreme Edition label
- It is more like an “Emergency Edition” than an “Extreme Edition”, meant to respond to AMD 64
- Optional 2 MB L3 cache

Intel Roadmap

AMD 64
- AMD 64 builds a bridge from the 32-bit to the 64-bit world
- Provides great performance without parallel
- Simultaneous 32- and 64-bit computing
- More physical address space: 1 TB, not limited to 4 GB
- Applications can use up to 4 GB instead of 2 GB
- Worry-free on memory
- A lot less swapping to virtual memory
- A single architecture designed to fit all

AMD Athlon 64 & 64 FX
- Athlon 64 is 754-pin
- Athlon 64 FX is 940-pin

New Changes
- 1 MB L2 cache
- Integrated memory controller
- HyperTransport channel
- Lower power requirements
- New AMD core
- Double the registers
- Integrated DDR memory controller
- Enlarged Translation Look-aside Buffer (TLB)
- Pipeline extended from 10 to 12 stages

AMD 64 Processor Architecture

Integrated Memory Controller
- Provides sufficient low-latency memory bandwidth to the processor core
- The integrated memory controller changes the way the processor accesses main memory
- It greatly increases bandwidth and reduces latency, and thus speeds up processing
- Runs the memory controller at processor
  speeds rather than FSB speeds
- Boosts performance for many applications with intensive memory use
- Available memory bandwidth of up to 6.4 GB/s with Opteron and FX, and 3.2 GB/s with Athlon 64

AMD 64 Core
- Enables simultaneous 32- and 64-bit computing
- There are 3 main usage categories for the AMD 64 core:
  1. 32-bit applications under a 32-bit OS
  2. 32-bit applications under a 64-bit OS
  3. 64-bit applications under a 64-bit OS
- Great for migration

HyperTransport
- Increases overall system performance by reducing I/O bottlenecks, increasing system bandwidth and reducing system latency
- High-speed I/O communication
- Up to 6.4 GB/s bandwidth per link, improving interconnection with system components
- Up to 3 HyperTransport links (only on Opteron)

SSE/SSE2 Registers
- Double the number of registers
- SSE registers doubled to improve floating-point calculations

Enlarged Translation Look-aside Buffer (TLB)
- A larger TLB caches more virtual-to-physical address translations, reducing trips to the page tables in system memory

Pipeline
- Pipeline extended from 10 to 12 stages to increase clock speeds
- Branch prediction reworked

Problems
- AMD partnered with Nvidia, but the nForce 3 chipset is not mature
- nForce 3 has a low AGP performance bug with the HyperTransport channel interface
- It turns out that VIA offers a better chipset for AMD 64

AMD 64 FX-51
- An “Opteron” processor with an FX label
- Slight change in DDR400 support (reduced validation)
- The major difference from the Athlon 64 is a 128-bit memory controller vs. 64-bit
- Works with dual-channel Registered memory
- Athlon 64 works with single-channel unbuffered DDR memory

Final Word on FX-51
- The Athlon 64 3400+ brings the death of the FX-51
- According to benchmarks from different areas, the Athlon 64 3400+ comes in very close behind the FX-51
- But its price is a little over half that of the FX-51 ($417 vs. $733)
- Or you can wait until the FX-53 comes out

Watch Out!
- AMD is talking about a new Socket 939 around late this year

AMD Roadmap 754

AMD Roadmap 940

AMD Roadmap 939

Benchmarks - OpenGL

Benchmarks -

Benchmarks -

Benchmarks - Business App

Result Summary
- AMD is good when it comes to business/gaming/2D work with respect to price/performance ratio
- Intel offers the best in encoding and 3D performance as well as multitasking

Conclusion
- It is very hard to compare the new processors
- AMD 64 lacks true 64-bit applications
- Intel Prescott lacks SSE3-enhanced applications and has “out-of-date” video drivers and DirectX
- Hardware opens the door to the future, but until software catches up we won't be able to truly experience the great enhancements

Sources
- Intel Corp
- AMD Corp
- Tom's Hardware
- AnandTech
- ExtremeTech
- Tech Report
- X-bit labs
- Opteronics
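
As a rough cross-check of the memory figures quoted in the AMD slides (6.4 GB/s vs. 3.2 GB/s, a 128-bit vs. 64-bit memory controller, DDR400, and 1 TB vs. 4 GB of address space), the arithmetic can be sketched in Python. The helper name peak_bandwidth_gb_per_s is made up for this example. The sketch assumes that DDR400 performs 400 million transfers per second, that GB/s is counted in decimal units, and that 1 TB and 4 GB are read as powers of two.

```python
def peak_bandwidth_gb_per_s(bus_width_bits, transfers_per_second):
    """Peak bandwidth = bytes per transfer x transfers per second (decimal GB/s)."""
    return bus_width_bits / 8 * transfers_per_second / 1e9


# Assumption: DDR400 moves data 400 million times per second.
DDR400_TRANSFERS = 400e6

# 128-bit (dual-channel) controller, as on Opteron and Athlon 64 FX.
dual_channel = peak_bandwidth_gb_per_s(128, DDR400_TRANSFERS)
# 64-bit (single-channel) controller, as on Athlon 64.
single_channel = peak_bandwidth_gb_per_s(64, DDR400_TRANSFERS)

# Address-space sizes, read as binary units (assumption).
FOUR_GB = 2 ** 32  # limit of 32-bit addressing
ONE_TB = 2 ** 40   # 1 TB corresponds to 40 address bits

print(f"dual-channel DDR400:   {dual_channel:.1f} GB/s")
print(f"single-channel DDR400: {single_channel:.1f} GB/s")
print(f"1 TB is {ONE_TB // FOUR_GB} times the 4 GB limit")
```

Under these assumptions the doubled bus width alone accounts for the factor of two between the FX/Opteron and Athlon 64 bandwidth figures.
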