Training video series
JESD204B
This three-part training series covers the fundamentals of, and tips for, working with the JESD204B serial interface standard, which offers improvements over traditional LVDS and CMOS interfaces in board area, FPGA/ASIC pin count, and deterministic latency. Our JESD204B ADCs, DACs, clock ICs, and development tools use the JESD204B interface so you can quickly evaluate, design, and implement your designs. Learn more now through this on-demand training series.
JESD204B training, part 1 of 3: Overview
Presenter
Resources
Hello. My name is Jim Seton, high-speed data converter application engineer for Texas Instruments, and I will be going over the JESD204B standard. This presentation is an overview of the standard to assist engineers who may be seeing the standard for the very first time.
So the introduction of the JESD204B interface for use between data converters and logic devices has provided many advantages over previous-generation LVDS and CMOS interfaces, including simplified layout, easier skew management, and deterministic latency. However, understanding this interface and applying it to a signal chain design may seem like a daunting task. This presentation will give an overview of the important aspects of this interface and how it can be used in the real world.
Some of the things that will be covered in this presentation are the history of the 204B, some pros and cons, some timing signals that the interface uses, the layers involved with the standard, deterministic latency, subclass 0, 1, and 2, common configuration parameters, some additional information that can be downloaded from the TI website, and a summary.
So what is 204B? Basically it was a standard, a serial interface standard, developed for data converters interfacing to logic devices, such as FPGAs or ASICs. Currently, 204B can run serial data rates up to 12.5 gig. It has a mechanism for providing deterministic latency, uses 8b/10b encoding for synchronization, clock recovery, and DC balancing.
And basically, the biggest benefit of 204B is that it drastically simplifies the PCB layout, because the parts now use far fewer outputs since we are dealing with serial output data as opposed to a parallel LVDS-style layout.
So the standard was generated by the Joint Electron Device Engineering Council, also known as JEDEC. It is basically referred to as a serial interface for data converters. The standard can be downloaded from the link shown here on this slide. It is a roughly 147-page document with a lot of information that sometimes can be hard to understand.
So we decided to come up with an overview presentation that is basically a CliffsNotes version of the standard, one that extracts the important information and allows the user to better understand the standard. Clicking on the previous link takes you to the main JEDEC page, as shown here. A person would just have to register, log in, and would then be able to download the standard.
So a quick history of the 204B. When it first came out in 2006, it was simply known as 204, basically only ran up to 3 gigabits per second, did not have any deterministic latency. 204A came out in 2008. The speed was still the same, but it did allow for multiple lanes. And then in 2011, JESD204B came out, which bumped up the speed to 12.5 gigabits per second, allowed multiple lanes, multiple lane synchronization, and added deterministic latency.
Revision C is currently in the works. And when it does come out, the number of SerDes lanes is going to be bumped up to 32, and the speed will approach 28 gigabits per second.
Some of the major advantages of a 204B device are that the pin count is drastically reduced. As you can see in this slide comparing a CMOS device versus an LVDS device versus a JESD device, that is, a quad 16-bit converter, the total number of pins is drastically reduced. And the package size is actually smaller, too, due to the lower pin count.
With a lower pin count, not only is the package smaller, but the routing requirements are also reduced. As you can see in this example, we show an LVDS device on top, which has 32 lanes that need to be routed, versus a JESD device, which only has eight lanes.
Some of the disadvantages of 204B are that the standard is a little bit more complex than a typical LVDS interface. If you do pursue an FPGA solution, you may also be required to buy an IP license from the vendor. But if you are a large company, in a lot of cases these licenses are probably given to you for free.
Another big disadvantage is latency. Whereas LVDS data is pretty much available instantaneously, with this standard there is 8b/10b encoding involved, there is some monitoring of the bits, and there is some bit-shifting alignment to establish the link. All of this adds up to a fixed amount of latency that could be a problem in some applications.
Troubleshooting a JESD link can also be more complicated than a typical LVDS interface. The other thing, too, is PCB material. You are probably going to be running at a much faster SerDes rate, which could require, in most cases, a more expensive material.
We will now be talking about 204 timing signals and terminology. So the main signal in 204B is what is referred to as the device clock. It is a system clock, which is distributed to all the components in the link. So if you have two ADCs and an FPGA, all three will get a device clock. In parallel with that, there is a SYSREF signal. And SYSREF is only required in subclass 1 mode, which we will talk about later.
SYSREF is a timing reference which basically is used by all devices in the link, which will synchronize what is known as the LMFC, the Local Multi-Frame Clock, which is basically internal to each component. Continuing, there is a frame clock. The frame clock coming out of the transport layer is aligned to the frame clock inside the device. Per the standard, the period of the frame clock must be identical in all the components in the link. A sample clock is used to sample data.
And then the local multi-frame clock, which we mentioned in the previous slide, is used for multi-frame boundary alignment. It consists of K frames. So the basic equation is that the LMFC is your frame clock divided by your K value.
The SYSREF, which we discussed in the previous slide, is related directly to the multi-frame clock. And it is basically the local multi-frame clock divided by n, where n is a positive integer, such as 1, 2, 3, 4. And in most systems, we use a relatively high n value to bring down the frequency of SYSREF.
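To make these clock relationships concrete, here is a minimal Python sketch of the frame clock, LMFC, and SYSREF relationships just described. The example numbers are purely illustrative and are not taken from any particular slide or device.

```python
# Illustrative sketch of the JESD204B clock relationships described above.
# The example numbers are hypothetical, not values from the slides.

def lmfc_frequency(frame_clock_hz: float, k: int) -> float:
    """LMFC = frame clock / K, where K is frames per multi-frame."""
    return frame_clock_hz / k

def sysref_frequency(lmfc_hz: float, n: int) -> float:
    """SYSREF = LMFC / n, where n is a positive integer."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return lmfc_hz / n

frame_clock = 122.88e6                     # example frame clock
K = 32                                     # frames per multi-frame
lmfc = lmfc_frequency(frame_clock, K)      # 3.84 MHz
print(lmfc)
print([sysref_frequency(lmfc, n) for n in (1, 2, 4, 8)])  # 3.84, 1.92, 0.96, 0.48 MHz
```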
Next slide, the SYNC signal. The SYNC is a signal that basically is used to start your link. It is transmitted from the receiver to the transmitter. In subclass 2, it is used as the phase reference for the LMFC, whereas in subclass 1 it is SYSREF. And then per the standard, the SYNC signal is an active low signal.
Some terminology now. CGS is called the Code Group Synchronization. Basically, this is the initial process that occurs when the link first comes up. The requirement per the standard is that there is a minimum of five frames plus nine octets during this period. The ILAS, known as the Initial Lane Alignment Sequence, is the process by which the frames and the multi-frames are aligned. It immediately follows the CGS step in the link establishment.
RBD, which is the RX Buffer Delay. This is known as the release time opportunity after LMFC boundaries have been established. Basically, you have multiple lanes that are all independent of each other, but they have different delays.
This buffer is used to absorb data on lanes that arrive earlier than slower ones, which allows all the lanes to eventually be aligned after a fixed amount of time. The problem with this delay is that it, again, is going to add latency. So the user has to be careful when selecting the delay for the link.
Some of the key parameters in JESD204 are the L, M, F, S, K, and CS shown in this slide. L is the number of lanes per converter device. M is the number of converters per device. F is the number of octets per frame. S is the number of samples per converter per frame. K is the number of frames per multi-frame.
CS, a not very commonly used parameter, is control bits per conversion sample. Control bits are usually an option that can be used when you have a converter that is less than 16 bits, since the standard is set up for 16 bits. So if you have a 14-bit device, that frees up two bits that can be used as control bits.
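To show how these parameters hang together, here is a small Python sketch that checks a configuration for consistency. It assumes the common case of 16-bit sample containers (N' = 16, i.e. sample bits plus control and tail bits) and is only a rough, illustrative check, not a full validator of the standard.

```python
from dataclasses import dataclass

@dataclass
class JesdParams:
    L: int             # lanes per converter device
    M: int             # converters per device
    F: int             # octets per frame, per lane
    S: int             # samples per converter per frame
    K: int             # frames per multi-frame
    CS: int = 0        # control bits per conversion sample
    N_prime: int = 16  # bits per sample container (sample + control + tail bits)

    def expected_f(self) -> float:
        # Each frame carries M * S containers of N' bits, spread over L lanes.
        return (self.M * self.S * self.N_prime) / (8 * self.L)

    def is_consistent(self) -> bool:
        return self.F == self.expected_f()

# Example: 2 converters on 4 lanes, one 16-bit sample each per frame -> F = 1
print(JesdParams(L=4, M=2, F=1, S=1, K=20).is_consistent())  # True
```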
This slide kind of shows an overall diagram of a JESD204B link. In this example, we have an ADC connected to an FPGA. And then here we're just trying to show that the device clock between the two parts can be different frequencies, and in most cases, they usually are. And SYSREF can be two different frequencies as well. But the clocks, SYSREF, everything has to be derived from the same source. They all have to be synchronized with respect to each other.
And basically what we're trying to show here is that the critical timing is not necessarily SYSREF to SYSREF. It is SYSREF with respect to device clock at each part. The SYSREF has to meet a proper setup and hold time because it will be sampled in by the device clock.
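As a sketch of the setup-and-hold check being described, the following Python snippet compares where the SYSREF edge lands relative to the sampling device-clock edge against setup and hold requirements. The timing numbers are hypothetical placeholders; the real values come from each device's data sheet.

```python
# Hypothetical SYSREF setup/hold margin check at one device.
# t_setup/t_hold are placeholders; use the data sheet values for your part.

def sysref_margins(sysref_to_clk_ns: float, clock_period_ns: float,
                   t_setup_ns: float, t_hold_ns: float):
    """sysref_to_clk_ns: time from the SYSREF transition to the next
    rising device-clock edge, measured at the pins of this device."""
    setup_margin = sysref_to_clk_ns - t_setup_ns
    hold_margin = (clock_period_ns - sysref_to_clk_ns) - t_hold_ns
    return setup_margin, hold_margin

setup, hold = sysref_margins(sysref_to_clk_ns=1.2, clock_period_ns=4.0,
                             t_setup_ns=0.4, t_hold_ns=0.4)
print(setup >= 0 and hold >= 0)   # True (roughly 0.8 ns setup and 2.4 ns hold margin)
```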
We will now be talking about the layers involved with the 204B standard. The layers start out as, basically looking from the left, we have parallel data coming out of the transmitter. As it enters in the layers of the standard, it starts with the transport layer, followed by a scrambler layer, a link layer, then a physical layer. And then on the receive side, the same layers are identical, just in the opposite order. And at the very end of the receive, basically you have the same 16-bit data that you sent out.
Here's a functional block diagram of basically the slide we previously showed. The only thing different is we added the clocking in this slide. As you can see from the clock source, each device is getting a device clock and the SYSREF. There is a SYNC signal leaving the receiver and feeding the transmitter, and then the data path.
So the major function of a transport layer is to map the parallel data into octets. The standard basically runs off of 16 bits, which consist of two octets. A good example of what's actually going on is shown in this slide. This is directly out of the standard itself. The standard provides options for exactly how you can map, per se, the 16-bit data into serial data.
Most devices have options where you can use either eight lanes, four lanes, two lanes, or even one lane in some cases, and that lane configuration is done in the transport layer. As you can see here, it shows how the samples are going to be broken down and which path they're going to follow, which lane they're going to be eventually assigned to.
Some of the key parameters for the transport layer are the number of lanes per converter, the L, number of converters per device, the M, the octets per frame F, and the samples per converter S.
So here is an example of a transport layer, how it's broken down. It shows an 11-bit octal ADC converter. Basically in this case, your L is 4. We have four lanes. We have eight converters, four octets per frame, and one sample per frame. As you can see, in lane 0 we're combining two data converters, Cr0 and Cr1. Lane 1 has converters 2 and 3, lane 2, 4 and 5, so on and so on.
So as you can see, this one entire frame consists of one sample for each converter. So you have a total of eight converters sampled here across four lanes. And the TTTs shown are tail bits. So since it's an 11-bit converter, we have to pad it to get 16 bits total. So there are six additional tail bits added per converter.
In this slide, we are breaking down the terms in a different visual perspective. Here the flow is from right to left, which is not typical. But you have your ADC on the right, which is feeding in an FPGA on the left. So you have your four lanes with your frames on each lane. And then it shows how a multi-frame is broken up into four individual frames. Each frame is then broken up into eight octets, and each octet is actually eight individual bits.
So what are the L, M, F, S, and K for this link? Basically, there are four lanes and eight converters; your F is eight, meaning eight octets per frame. K is four, because you have four frames per multi-frame. And what is S? S is equal to 4, because in this case we have four samples per lane being transmitted.
The scrambling layer is used to randomize the data, and it spreads the spectral content to reduce spectral peaks that could cause EMI and interference problems. It uses a fixed polynomial, as shown in the slide. Most devices have options to either enable scrambling or bypass it. It does not add latency, which is good, since most people would prefer to use it.
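To illustrate what the scrambler does, here is a minimal bit-level sketch of a self-synchronous scrambler of the form 1 + x^14 + x^15, the polynomial used for JESD204B scrambling. Real devices operate on parallel octets with a defined bit ordering, so treat this only as an illustration of the principle and of why the descrambler recovers the data without any extra handshaking.

```python
# Minimal sketch of a 1 + x^14 + x^15 self-synchronous scrambler/descrambler.
# Bit ordering and octet-parallel operation in real hardware differ;
# this only demonstrates the principle.

def scramble(bits, state=None):
    s = list(state) if state else [0] * 15   # s[0] = most recent scrambled bit
    out = []
    for d in bits:
        y = d ^ s[13] ^ s[14]                # d XOR y[n-14] XOR y[n-15]
        out.append(y)
        s = [y] + s[:-1]
    return out

def descramble(bits, state=None):
    s = list(state) if state else [0] * 15   # state built from *received* bits
    out = []
    for y in bits:
        out.append(y ^ s[13] ^ s[14])
        s = [y] + s[:-1]
    return out

data = [1, 0, 1, 1, 0, 0, 1, 0] * 4
assert descramble(scramble(data)) == data    # round trip recovers the data
```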
Next, we'll be talking about the data link layer. This is the layer that actually does the 8b/10b encoding, and it is used for the code group synchronization and the initial lane alignment and frame synchronization. It also does the link monitoring using special control symbols.
So here's an example of some of the parameters used by 8b/10b encoding. Basically, the 8-bit octets are mapped into 10 bits. This gives a balanced mix of 0s and 1s with frequent transitions, which enables the CDR, Clock Data Recovery, technique. It also provides DC balancing and enables AC coupling.
The Data Link Layer, the Link Establishment. So this slide basically shows how a link is established. Basically, the whole sequence starts with the SYNC signal being asserted low, as shown by the top line. When SYNC is asserted low, the receiver is telling the transmitter that we need a special character to be transmitted, which is known as the K28.5. This is what allows the receiver to do bit shifting to get the serial interface lined up properly.
So the transmitter is constantly going to be sending K28.5 characters when SYNC is held low. And the receiver must receive at least four consecutive K28.5s to verify that the link has been established. And in that case, it will then de-assert SYNC and send SYNC high.
Once SYNC high is received by the transmitter, it will stop sending the K28.5s and start sending special data, which will then start up the initial lane alignment sequence. Once the initial frame and lane synchronization is completed, then the transmitter will just start sending out data.
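The bring-up sequence just described can be summarized as a small receiver-side state machine. The sketch below is a Python simplification: the symbol names are placeholders and the real ILAS contains configuration octets and strict framing, but it captures the CGS, ILAS, and data phases and the behavior of the active-low SYNC signal.

```python
# Simplified receiver-side view of JESD204B link bring-up: CGS -> ILAS -> DATA.
# Symbol names are placeholders; real ILAS framing is richer than shown here.

K28_5 = "K28.5"   # /K/ comma character sent while SYNC~ is asserted low
K28_3 = "K28.3"   # /A/ character that ends each ILAS multi-frame

def bring_up(symbols):
    state, sync_n = "CGS", 0          # SYNC~ low: requesting comma characters
    commas = multiframes = 0
    for sym in symbols:
        if state == "CGS":
            commas = commas + 1 if sym == K28_5 else 0
            if commas >= 4:           # four consecutive K28.5s -> code group sync
                sync_n, state = 1, "ILAS"
        elif state == "ILAS":
            if sym == K28_3:
                multiframes += 1
                if multiframes == 4:  # ILAS is four multi-frames long
                    state = "DATA"
        yield state, sync_n

stream = [K28_5] * 6 + ["R", "cfg", K28_3] * 4 + ["D0", "D1"]
print(list(bring_up(stream))[-1])     # ('DATA', 1)
```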
So this slide is just a short description of what we just talked about. Basically, SYNC is asserted, the transmitter starts sending K28.5 comma characters, and after the receiver receives four of them, SYNC gets de-asserted.
This slide then shows the initial lane alignment sequence. This sequence basically consists of four multi-frames. After the K28.5 comma characters have completed, the very first symbol following them will be what is known as a start-of-sequence character, which is a K28.0, and this happens at the start of the first multi-frame.
And this will be followed by some special data. And at the end of the multi-frame, there will be a special lane alignment character called the K28.3. So each multi-frame within this sequence has a special start and a special end character.
The second multi-frame consists of basically the important data, which is known as the link configuration data. This data will consist of information such as the number of lanes, the number of converters, and so on. If the receiver is receiving information telling it how many lanes the transmitter is using and that does not line up with what the receiver is set up for, the link will not be established, and an error will be recorded.
This slide shows the link configuration data that will be transmitted in the second multi-frame. Like mentioned earlier, there will be information such as lanes, number of converters, number of bits, which subclass, so on and so on.
The data link layer also performs frame alignment monitoring. This is done by the transmitter when it sees two consecutive octets of the same value during normal data transmission. When it sees two consecutive octets that are the same, it will send out what is known as a K28.7 alignment character.
So when this data is received by the receiver, it is going to compare it across each lane to make sure that it is occurring in all of them at the same point in time. And if everything is aligned, since the receiver is storing data, it is going to replace this K28.7 with the previous octet value, which is the data that was sent earlier. So the data is never lost.
There's also a K28.3, which is a multi-frame alignment character. So if the receiver detects that there is a misalignment, you can set it up to either ignore it, maybe wait until one, two, or three of these happen, or you can tell it that the link is messed up and we need to reissue SYNC to reestablish the link.
Error Reporting. Below are some of the standard errors that could occur. Especially with 8b/10b, there could be a disparity error, a not-in-table error, or a control character in the wrong position. Most devices will have some type of method for storing these errors, and they'll also have some type of method for how to process these errors. In some cases, they may ignore them. In other cases, they may reestablish the link.
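Conceptually, the error-handling policy might look like the sketch below. This is a hypothetical illustration in Python: the error names mirror those above, but the counting and the re-sync threshold policy are made up for the example, not taken from the standard or any specific device.

```python
# Hypothetical error tally with a simple "request re-sync after N errors" policy.
# Error names mirror the 8b/10b errors above; the policy itself is illustrative.

from collections import Counter

ERRORS = {"disparity", "not_in_table", "unexpected_control_char"}

class LinkMonitor:
    def __init__(self, resync_threshold: int = 3):
        self.counts = Counter()
        self.threshold = resync_threshold

    def report(self, error: str) -> bool:
        """Record an error; return True when a re-sync request should be issued."""
        if error not in ERRORS:
            raise ValueError(f"unknown error type: {error}")
        self.counts[error] += 1
        return sum(self.counts.values()) >= self.threshold

mon = LinkMonitor()
print([mon.report(e) for e in ("disparity", "not_in_table", "disparity")])
# [False, False, True] -> third error trips the (illustrative) re-sync request
```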
In this next slide, we'll be talking a little bit about the physical layer. This is actually the output of the transmitter, per se, and the input of the receiver. Shown below is a table that has been brought in directly from the standard showing the speed range, the differential output voltage, rise/fall time, bit error rate, et cetera. As you can see, as the SerDes rate increases, the differential output voltage decreases.
Shown here is the physical layer information directly brought in from the JESD standard. Basically shows the speed range of the different standards, the original A and B, differential output voltage, bit error rate, and so on. As the data rate increased, you can see that the differential output voltage decreased.
We will now be talking about deterministic latency. So what is deterministic latency? Basically, when these links come up, whether at power-up or from a reset, the user wants to make sure that the output data is going to be valid at the same time with respect to when it was sampled, every time, across lanes, across parts, et cetera.
This was not an option with 204 and 204A. This is one of the major features of 204B. It allows for synchronous sampling, multi-channel phased-array alignment, and gain control loop stability, and it is a very important feature.
So within the 204B standard, there are three subclasses: 0, 1, and 2. Subclass 0 is basically a backward-compatible option that allows this standard to work the same as 204A, which did not have deterministic latency. Subclass 1 does have deterministic latency and uses SYSREF with strict timing requirements. And then there is subclass 2, which does not use a SYSREF. It uses the SYNC for timing, and it also has tight timing requirements.
So deterministic latency is achieved with these architectural features. SYSREF is used for your master synchronization. Your LMFC provides a low-frequency reference for each of the devices. And the receiver has what is known as an elastic buffer that absorbs any link delays that may be present.
So some of the general requirements of deterministic latency are that your elastic buffer must be large enough to absorb the delay that could be associated with your link. So according to the standard, the LMFC period must be longer than your longest link delay. The K parameter comes into play here, as you can see by this equation. K has to be between 1 and 32. And F times K has to be between 17 and 1,024. In this case, F is the number of octets per frame.
Some of the general requirements of the deterministic latency are shown in this slide. Your elastic buffer must be large enough. Your LMFC period must be longer than your longest link delay, and your receiver buffers on all lanes must release at the proper time.
If they release too early or too late, your lanes will no longer be synchronized. Key thing to remember here out of the standard is your K value has to be between 1 and 32, and K times F must be between 17 and 1,024, where F is the number of octets per frame.
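Those two constraints are easy to capture in a short helper, as in this Python check of the values quoted above.

```python
# Check of the K and F constraints quoted above:
#   1 <= K <= 32  and  17 <= F * K <= 1024, with F = octets per frame.

def k_and_f_valid(f: int, k: int) -> bool:
    return 1 <= k <= 32 and 17 <= f * k <= 1024

print(k_and_f_valid(f=1, k=16))   # False: F*K = 16 is below the minimum of 17
print(k_and_f_valid(f=1, k=20))   # True
print(k_and_f_valid(f=8, k=32))   # True: F*K = 256
```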
This slide is a view directly from the standard. It is showing a link that is being brought up. The top portion is your transmitter. The bottom portion is your receiver. It is showing your SYSREF, your SYNC, your LMFC, and a couple of data lines. What happens at the very beginning is that a SYSREF pulse is issued. SYSREF is then used to synchronize your LMFC between your transmitter and receiver.
After the two parts are synchronized, your SYNC signal is going to be asserted. As you can see, your K comma characters are coming out. And looking down below, you're going to see that we have an early lane and a late lane. One lane is going to establish CGS before another, and that is shown by the transition between the K and the R character.
Once all lanes have established CGS, the SYNC signal will be de-asserted. And during this time, though, each lane is buffering data. As you can see, the earliest lane is going to need to buffer more data than the late lane. So the lanes are buffering data and waiting for the next rising edge of the LMFC period. When that occurs, that is when the data will be released by each individual lane.
So this allows for the lanes to be synchronized, and it also allows for the data to be deterministic. So now that we know when that data is going to be valid, every time you power cycle or do a reset that data is going to appear at the same point in time.
This next slide shows the requirements of deterministic latency in the subclass 1 mode. Here we're showing basically the setup and hold time required of SYSREF with respect to the device clock. What is nice is TI makes a very nice clock device known as the LMK04828, which provides pairs of these, up to seven pairs of SYSREF and device clocks. And it actually has features to allow you to adjust the individual SYSREF on each pair so you can actually do some fine tweaking if you are having trouble meeting setup and hold time downstream.
SYSREF does not always have to be running. Once the transmitter and receiver have received it and their LMFCs are locked to it, SYSREF can be turned off. And in most cases, we do turn it off, especially with the ADC. Here are just some examples: a periodic SYSREF, a gapped periodic SYSREF, or a one-shot.
Some of our earlier ADCs were very sensitive to the SYSREF signal continuously running. As you can see in this shot, some spurs are showing up that were directly related to SYSREF. After SYSREF has been turned off, you can see that the spurs have gone away. The only problem with turning off SYSREF is that if you ever do lose your link, you will have to turn it back on or you will not be able to establish the link.
In this next section, we will be talking about subclasses 0, 1, and 2 of the standard. The three subclasses are defined as shown below. Basically, subclass 0 is backward-compatible with JESD204A. There is no defined method for local multi-frame clock alignment. It uses the SYNC signal to start the link.
Subclass 1 supports deterministic latency. It uses SYSREF to align the local multi-frame clocks, and it also uses SYNC to initiate the lane alignment sequence. And subclass 2 also supports deterministic latency, but does not use SYSREF. It only uses the SYNC signal, both to start the link and to do multi-frame clock alignment.
JESD204B subclass 0 is not very common. Again, it just allows the standard to be backward-compatible with 204A. There is no support for deterministic latency. And this slide shows a quick timing configuration for subclass 0. Basically, it uses a frame clock between the two parts, a SYNC, and then the data.
204B subclass 1 is the most popular subclass, as this is what is used by most customers to achieve deterministic latency. And it requires the SYSREF signal, and it also uses SYNC to start your synchronization. But SYNC is not a critical signal in this subclass. That's a key point to remember because if you think about using subclass 2, then SYNC becomes a key critical timing signal.
And in this diagram here, the clocking scheme is shown for subclass 1. We have a clock generator device shown at the top. It is generating the SYSREF and device clock for all three components shown in this example. A very key thing to remember is that if you have multiple ADCs or DACs, the device clock and SYSREF going to all of them need to be matched as closely as possible.
The SYSREF and device clock going to the FPGA need to be matched to themselves, but not matched to the signals going to the ADCs or DAC. So that makes the routing a little bit easier on the user. In this case here, it shows two ADCs using the same SYNC signal coming out of the logic device.
The 204B subclass 2 uses the SYNC signal for synchronization and deterministic latency. And since SYNC is used as the latency signal, its timing is very critical. Basically, that means that the SYNC signal going to all devices, whether FPGA, ADC, or DAC, must arrive at as close to the same time as possible. Due to this requirement, it is highly recommended not to use subclass 2 if you are sampling at greater than 500 megasamples per second.
This slide shows a quick clocking scheme used by subclass 2. As you can see, the clock generator is only sending a device clock to all the devices, and a SYNC signal is being shared to the two ADCs in this example. So in this case, the device clocks all must be matched length going to all three parts.
In this section, we will be talking about key configuration parameters of the JESD204B standard. Some of the key configuration parameters, which we mentioned a little bit earlier, are the L, M, F, and S. Fs is your converter sample rate. K is the number of frames per multi-frame, which gives your LMFC. The serial line rate is your SerDes speed.
So basically, your serial line rate is your sample rate times 10 times F, where 10 is due to the 8b/10b encoding. And then your local multi-frame clock is basically your sample rate divided by your K factor.
In this slide, we are showing an example of some real numbers used by a real JESD part. In this case, it's the ADS42JB49 that Texas Instruments came out with several years back. In this example, we're going to be setting up our LMFS as a 4211 parameter, as shown below. So this device actually has four lanes per device. It has two converters. And in this mode, we're going to send out one octet per frame per lane.
So if we are sampling this device at its max rate, which is 250 megahertz, we're going to pick K equal to 20. Since K times F has to be at least 17 and our F is equal to 1, we don't have much choice in this case, and K is set to 20 to satisfy that requirement.
So our lane rate is going to be our sample rate times 10 times F, which comes out to be 2.5 gigabits per second. The internal local multi-frame clock is going to be running at Fs divided by 20, which is 12.5 megahertz. To determine how fast your SYSREF needs to be, you basically take your LMFC and divide it by n, where n is any positive integer. So the fastest you can run SYSREF in this case would be 12.5 megahertz, or you can run it much slower: 6.25 megahertz, 3.125 megahertz, et cetera.
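The arithmetic in this example can be re-derived in a few lines. The sketch below just repeats the numbers quoted above (Fs = 250 MSPS, LMFS = 4-2-1-1, K = 20).

```python
# Re-deriving the example numbers: Fs = 250 MSPS, LMFS = 4-2-1-1, K = 20.

Fs = 250e6                   # converter sample rate
F, K = 1, 20                 # octets per frame, frames per multi-frame

lane_rate = Fs * 10 * F      # factor of 10 comes from 8b/10b encoding
lmfc = Fs / K                # frame clock equals Fs here (S = 1), divided by K

print(lane_rate / 1e9)       # 2.5  -> 2.5 Gbps per lane
print(lmfc / 1e6)            # 12.5 -> 12.5 MHz LMFC
print([lmfc / n / 1e6 for n in (1, 2, 4)])  # SYSREF options: 12.5, 6.25, 3.125 MHz
```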
In most data sheets, you will see these parameters listed in tables, as shown in this example in the slide. On the top right, there is an LMFS table directly out of the ADS42JB69 data sheet showing you the two modes that are available with this part. This part only can run in two modes, and one involves four lanes, one involves two lanes.
And as you can see, the sampling rate is reduced when only running two lanes due to the fact that the SerDes rate on this device maxes out roughly at 3.2 gig. So it's not really the sample rate of the ADC per se, but it's actually the SerDes rate in this case.
Below that top table is another table out of the data sheet showing how the lane assignments are used for these two modes. On the left, we have the four-lane mode showing the four lanes: DA0, DA1, DB0, and DB1. It shows how the octets are mapped per lane. So DA0 in this example contains the upper eight bits of a sample, whereas DA1 contains the lower eight bits of the sample for converter A, and the same for channel B.
In the two-lane mode, you can see that both the upper and lower bits are combined on one lane. So since the two octets are in one lane, your parameter for F is a 2 in this case, instead of 1 in the four-lane mode.
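To make the four-lane versus two-lane mapping concrete, here is a small illustrative sketch of how one frame of two 16-bit samples could be split into octets. The lane names follow the DA0/DA1/DB0/DB1 convention described above, and the sample values are arbitrary; the exact bit ordering is defined by the device data sheet, so treat this as a sketch only.

```python
# Illustrative octet mapping of one frame for a 2-converter, 16-bit device.
# Lane names follow the data sheet convention above; exact bit ordering is
# device-defined, so this is only a sketch.

def split_octets(sample: int):
    return (sample >> 8) & 0xFF, sample & 0xFF       # (upper octet, lower octet)

def four_lane_frame(sample_a: int, sample_b: int):
    a_hi, a_lo = split_octets(sample_a)
    b_hi, b_lo = split_octets(sample_b)
    # F = 1: each lane carries one octet per frame
    return {"DA0": [a_hi], "DA1": [a_lo], "DB0": [b_hi], "DB1": [b_lo]}

def two_lane_frame(sample_a: int, sample_b: int):
    a_hi, a_lo = split_octets(sample_a)
    b_hi, b_lo = split_octets(sample_b)
    # F = 2: both octets of a sample stay on the same lane
    return {"DA0": [a_hi, a_lo], "DB0": [b_hi, b_lo]}

print(four_lane_frame(0x1234, 0xABCD))
print(two_lane_frame(0x1234, 0xABCD))
```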
In this slide, we are showing a different TI converter, the ADC12J4000, just to show you that from part to part the LMFS can vary by a lot. In this case, we are showing an extreme case where the LMFS is 8885, where we have eight lanes running. And we have eight internal converters that are basically being used to generate one output. There are eight octets per frame, and we are going to get a total of five samples per lane in this case.
What's unique about this case, too, is that we are sharing samples across octets. If you look at lane 0, sample 0 is actually 12 bits. So you're using octet 0 to get the lower eight bits and then octet 1 to get the upper four bits. But the other four bits of octet 1 are also being used by sample 8.
So this is a way of compacting the data across a lane without adding a whole lot of tail bits, which would reduce your bandwidth. And in this case, as you can see, when you get to the very end, there is only one leftover tail bit that needs to be added to generate a total of 16 bits in octets 6 and 7.
In this slide, the configuration parameters are shown for a TI DAC38J84. As can be seen here, this DAC has many options for receiving data across the JESD link. Starting to the left, the LMF of eight lanes is shown, where you can receive up to eight lanes of data with four converters and one octet per frame, followed by a 4-4-2 case, where you're only using four lanes for the converters, then followed by only two lanes for the four converters. And in the last one, we're only actually using one lane for all the data for the four converters.
Some other key configuration parameters are shown in this slide. And this is basically an example of a DAC, or it could be an example of an FPGA in receive mode. The buffer size, which determines the elastic buffer size, is fixed by the silicon for our DACs, but in an FPGA it is usually programmable. There is the RBD, which is your Receiver Buffer Delay and can be adjustable, and your scrambling enable/disable.
And then most of them will have a SYNC request enable, basically a register that enables various resynchronization triggers. So if certain errors are detected, the firmware can write to this particular register and say, we need to issue SYNC to reestablish the link.
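Pulling these receive-side knobs together, a configuration might conceptually look like the structure below. This is a hypothetical, register-style sketch; the actual field names and encodings come from the DAC data sheet or the FPGA IP core documentation.

```python
# Hypothetical receive-side JESD204B configuration. Field names are
# illustrative only; consult the device or IP core documentation for the
# real register map.

rx_link_config = {
    "elastic_buffer_depth": 32,   # fixed by silicon on DACs, programmable in FPGA cores
    "rbd": 10,                    # receiver buffer delay (release opportunity), adjustable
    "scrambling_enable": True,
    "sync_request_enable": {      # which detected errors may trigger a re-sync
        "disparity_error": False,
        "not_in_table_error": False,
        "frame_alignment_error": True,
        "multiframe_alignment_error": True,
    },
}

print(rx_link_config["rbd"], rx_link_config["scrambling_enable"])
```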
Next section, we'll be talking about additional information available for the JESD204B standard. If you go to the TI site, www.ti.com, and then select High Speed ADC, then JESD204B Interface, it will take you to the website shown here in the slide. From this page, you can download application notes and videos, and you can access data sheets of all the JESD204B ADCs and DACs currently available from TI.
This next slide has links to many blogs that have been created to help customers bring up JESD204B and understand 204B, and so on. You can also go to FPGA vendors, such as Altera and Xilinx, and look at their JESD204B links, which we have shown here below. Both vendors have done sample designs using TI data converters, with firmware that customers can download to help them get up and running quickly.
Summary. JESD204 is a standard serial data interface for data converters. The JESD204B subclasses offer three implementation variations. Your transport layer defines data framing into serial lanes. Your link layer defines encoding, synchronization, and data monitoring. Your physical layer defines the electrical and timing performance. Deterministic latency is achieved with subclasses 1 and 2 and is required for known/constant latency through the link.