Bitbanging Ethernet on an Arduino

In this article, I present a program that allows an AVR-based Arduino to send and receive Ethernet packets at 10 Mbit/s without an external Ethernet controller chip. The program only works on some Arduinos: the board must run at 20 MHz (which may require hardware modifications), the AVR chip must have features that are only available on some parts (such as the atmega2560 on an Arduino Mega), and external resistors and capacitors are still needed to adjust the voltage levels. The program also implements a simple ping server that works about 40% of the time. It can also transmit and receive simple UDP packets, so a possible use case would be a sensor that transmits a value at regular intervals.

Since the Arduino is only barely fast enough, transmitting a packet requires large sections of the packet to be hard-coded. The two PWM outputs of one of the timers are used to generate the differential signal and to perform the Manchester encoding (explained below).

To receive Ethernet packets, two of the UARTs are used in SPI mode. Both RX1 and RX2 are connected to the positive wire of the differential signal via a voltage shifting circuit. Both UARTs sample the incoming signal such that one always samples the first half of the bit and the other always samples the second half. The ability to receive Ethernet signals without an Ethernet controller is particularly interesting as I have found no evidence that this has been done before.

Structure of an Ethernet packet

Ethernet signals are Manchester encoded, meaning that a 1 is encoded by a negative voltage followed by a positive voltage, and a 0 is encoded by a positive voltage followed by a negative voltage.
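As a minimal sketch of the encoding rule above (not code from the actual program), the following C function turns one byte into its 16 half-bit levels, least-significant bit first. The function name and the packing of the levels into a 16-bit word (first level in bit 0, 0 for negative, 1 for positive) are my own choices for illustration.

```c
#include <stdint.h>

/* Manchester-encode one byte, least-significant bit first.
 * Each data bit becomes two half-bit levels (0 = negative, 1 = positive):
 *   data 1 -> negative, positive
 *   data 0 -> positive, negative
 * The 16 levels are packed into the result with the first level in bit 0. */
uint16_t manchester_encode(uint8_t byte)
{
    uint16_t halves = 0;
    for (int i = 0; i < 8; i++) {
        unsigned bit = (byte >> i) & 1u;
        unsigned first = bit ^ 1u;  /* first half is the inverse of the data bit */
        unsigned second = bit;      /* second half equals the data bit */
        halves |= (uint16_t)(first << (2 * i));
        halves |= (uint16_t)(second << (2 * i + 1));
    }
    return halves;
}
```

With this packing, a byte of all zeros encodes to 0x5555 and a byte of all ones to 0xAAAA, which are the two "continuous stream" patterns that differ only in phase.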

An Ethernet packet consists of a preamble, data, and a frame check sequence (a CRC32). The bytes of the packet are transmitted least-significant-bit first. All packets are at least 64 bytes long (not counting the preamble, but including the frame check sequence), so it may be necessary to insert padding bytes after the actual contents of the message.
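To make the layout concrete, here is a rough sketch of how a packet could be laid out in a buffer before the frame check sequence is appended. It is not taken from the program, which hard-codes most of the packet instead. The function name, the buffer interface, and the use of zero bytes for padding are assumptions; the preamble bytes (seven 0x55 bytes followed by 0xD5) are the standard 10 Mbit/s Ethernet values rather than something stated above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PREAMBLE_LEN     8u   /* 7 x 0x55 plus the 0xD5 start delimiter */
#define MIN_FRAME_NO_FCS 60u  /* 64-byte minimum minus the 4-byte CRC32 */

/* Write preamble, frame contents and padding into `out`.
 * `frame` holds everything from the destination MAC up to the end of the
 * message. Returns the number of bytes written (the 4-byte frame check
 * sequence still has to follow), or 0 if `out` is too small. */
size_t layout_frame(uint8_t *out, size_t out_cap,
                    const uint8_t *frame, size_t frame_len)
{
    size_t body_len = frame_len < MIN_FRAME_NO_FCS ? MIN_FRAME_NO_FCS : frame_len;
    size_t total = PREAMBLE_LEN + body_len;
    if (out_cap < total)
        return 0;

    memset(out, 0x55, 7);                        /* preamble */
    out[7] = 0xD5;                               /* start-of-frame delimiter */
    memcpy(out + PREAMBLE_LEN, frame, frame_len);
    memset(out + PREAMBLE_LEN + frame_len, 0,    /* padding up to the minimum */
           body_len - frame_len);
    return total;
}
```

A 42-byte ARP message, for example, would be padded with 18 bytes so that it reaches 60 bytes before the CRC32.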

Transmitting Ethernet packets: fixed packets

The TIMER0 hardware of the AVR chip has two PWM outputs, A and B. The timer counts up until it reaches a threshold value, at which point it rolls over to zero. This threshold is set to zero, so the timer overflows on every clock cycle. The outputs are configured to toggle whenever the timer counter register is zero, so both outputs toggle on every cycle. Before transmitting the packet, the outputs are aligned so that A and B are always in opposite states (when one is high, the other is low, and vice versa). Left alone, this produces a continuous stream of all ones or all zeros. To change the phase of the output, the program writes 255 into the timer counter register. Since the timer now has to count from 255 to 0, it takes an extra cycle before toggling the outputs again.
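The effect of the skipped toggle can be illustrated with a small host-side model. This is only an abstraction, not a simulation of the real timer: it tracks a single output, treats each bit as a two-cycle slot, and assumes the held cycle is the first cycle of the slot. The function name and the `delay_mask` parameter (bit i set means "write 255 to the counter in slot i") are hypothetical.

```c
#include <stdint.h>

/* Abstract model of one timer output over eight 2-cycle bit slots.
 * `level` is the pin level at the end of the previous slot.
 * Returns the 16 half-bit levels, first level in bit 0. */
uint16_t toggle_waveform(uint8_t delay_mask, unsigned level)
{
    uint16_t wave = 0;
    level &= 1u;
    for (int slot = 0; slot < 8; slot++) {
        if (!((delay_mask >> slot) & 1u))
            level ^= 1u;  /* normal cycle: the output toggles */
        /* otherwise the counter was set to 255, so this cycle has no toggle */
        wave |= (uint16_t)(level << (2 * slot));

        level ^= 1u;      /* second cycle of the slot always toggles */
        wave |= (uint16_t)(level << (2 * slot + 1));
    }
    return wave;
}
```

With no delays, the model produces the alternating pattern 0x5555, which is a stream of zeros. Each delayed slot repeats the previous level once, which flips the phase and therefore the value of all following bits.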

The above method is then used to transmit the entire packet. The instructions

    nop
    out 0x26, r26

are abbreviated in the program as CHANGE, and the instructions

    nop
    nop

are abbreviated in the program as NOCHANGE. Thus, CHANGE causes the output to switch from a stream of zeros to a stream of ones (or vice versa), and NOCHANGE does not. Assuming the previous bit was a zero, the byte 0x45 would then be encoded as:

    CHANGE CHANGE CHANGE CHANGE NOCHANGE NOCHANGE CHANGE CHANGE

If the previous bit was a one, then 0x45 would be encoded as:

    NOCHANGE CHANGE CHANGE CHANGE NOCHANGE NOCHANGE CHANGE CHANGE
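The rule for choosing between CHANGE and NOCHANGE is simply "does this bit differ from the one before it". A small helper like the following could be used on a PC to generate the hard-coded sequence for a fixed byte; the function name and the 'C'/'N' output format are made up for this sketch.

```c
#include <stdint.h>

/* Fill seq[0..7] with 'C' (CHANGE) or 'N' (NOCHANGE) for each bit of
 * `byte`, least-significant bit first, and terminate it with '\0'.
 * `prev_bit` is the last bit that was transmitted before this byte. */
void change_sequence(uint8_t byte, unsigned prev_bit, char seq[9])
{
    unsigned prev = prev_bit & 1u;
    for (int i = 0; i < 8; i++) {
        unsigned bit = (byte >> i) & 1u;
        seq[i] = (bit != prev) ? 'C' : 'N';  /* phase flips only when the bit value changes */
        prev = bit;
    }
    seq[8] = '\0';
}
```

Calling it with 0x45 and a previous bit of zero yields "CCCCNNCC"; with a previous bit of one it yields "NCCCNNCC".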

Transmitting Ethernet packets: from memory

Using a similar method, it is possible to transmit bits from the chip's register file (r0-r15), but not directly from the chip's SRAM. Note that there is a nop instruction before every out instruction except the first. Therefore, it is possible to replace the nop instruction with an sbrc instruction, where sbrc means "skip if bit in register is cleared". If the corresponding bit in the register is set, then the next instruction is not skipped, and the state of the output changes. If the bit is cleared, the out instruction is skipped and the output does not change. Either way, the pair of instructions takes two clock cycles. For example, if r0 contains 0xCF, then the code:

    sbrc r0, 0
    out 0x26, r26
    sbrc r0, 1
    out 0x26, r26
    sbrc r0, 2
    out 0x26, r26
    sbrc r0, 3
    out 0x26, r26
    sbrc r0, 4
    out 0x26, r26
    sbrc r0, 5
    out 0x26, r26
    sbrc r0, 6
    out 0x26, r26
    sbrc r0, 7
    out 0x26, r26

would be equivalent to:

    CHANGE CHANGE CHANGE CHANGE NOCHANGE NOCHANGE CHANGE CHANGE

and so 0x45 would be written to the bus (assuming the previous bit was a zero). In the program, this is abbreviated as FROMREG("r0").

Depending on whether the most significant bit of the variable byte is a one or a zero, the first instruction of the following fixed byte may be either CHANGE or NOCHANGE. Therefore, there are registers reserved for the first bit immediately after every block of variable data.

In order to transmit more than 16 bytes of variable data, it is necessary to reload r0-r15. This can be done in place of a NOCHANGE instruction, as the lds instruction (load from RAM) also takes two clock cycles.
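Note that the register does not hold the data byte itself but a mask of the bit positions where the output must change. A sketch of how such masks could be computed for a block of variable data is shown below. The function name and interface are hypothetical, and the real program may prepare its registers differently.

```c
#include <stddef.h>
#include <stdint.h>

/* Convert `n` data bytes into change masks suitable for the sbrc/out
 * sequence: bit i of masks[k] is set when bit i of data[k] differs from
 * the bit transmitted just before it. `prev_bit` is the last bit sent
 * before the block. Returns the last bit of the block, which decides
 * whether the fixed byte that follows starts with CHANGE or NOCHANGE. */
unsigned fill_change_masks(const uint8_t *data, size_t n,
                           unsigned prev_bit, uint8_t *masks)
{
    unsigned prev = prev_bit & 1u;
    for (size_t i = 0; i < n; i++) {
        /* Each bit lined up with its predecessor (LSB is sent first). */
        uint8_t shifted = (uint8_t)((data[i] << 1) | prev);
        masks[i] = (uint8_t)(data[i] ^ shifted);
        prev = (data[i] >> 7) & 1u;
    }
    return prev;
}
```

For the data byte 0x45 with a previous bit of zero, the mask is 0xCF, matching the example above.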

Computing the CRC

The Ethernet protocol states that every Ethernet packet must end with a frame check sequence, which is a 32-bit cyclic redundancy check (CRC) of the entire packet (excluding the preamble). For fixed packets, this is easy, as the CRC can be pre-computed and hard-coded into the program. For variable packets, one must compute the CRC before the packet is transmitted. In order to save program space, a separate copy of the message is not saved. Instead, the CRC program steps through the program code that contains the hard-coded and variable sections of the message. The program then simulates this section of program code by looking for specific instructions. Specifically, the CRC program looks for nop, out, lds, sbrc, and sbic (skip if bit in IO register is cleared) instructions, and determines which bit the program would transmit. This is then used to compute the CRC for the message, which is then stored in registers r16-r19. The CRC is then transmitted after the contents of the packet.
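For reference, the CRC itself can be sketched as the usual bit-at-a-time CRC32 over a plain byte buffer. This is not the approach described above, which walks the program code instead of a buffer; it only shows the arithmetic that either approach has to perform. The function names are my own.

```c
#include <stddef.h>
#include <stdint.h>

/* Feed one byte into the CRC32 register, least-significant bit first
 * (reflected polynomial 0xEDB88320). */
uint32_t crc32_update(uint32_t reg, uint8_t byte)
{
    reg ^= byte;
    for (int i = 0; i < 8; i++)
        reg = (reg >> 1) ^ ((reg & 1u) ? 0xEDB88320u : 0u);
    return reg;
}

/* Frame check sequence of `len` bytes (everything after the preamble).
 * The register starts at all ones and is inverted at the end. */
uint32_t ethernet_fcs(const uint8_t *frame, size_t len)
{
    uint32_t reg = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++)
        reg = crc32_update(reg, frame[i]);
    return reg ^ 0xFFFFFFFFu;
}
```

The four bytes of the result are sent least-significant byte first. A common sanity check is the ASCII string "123456789", whose CRC32 is 0xCBF43926.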

Receiving Ethernet packets

The Arduino receives Ethernet packets by using two of its UARTs (serial ports), UART1 and UART2, to sample the incoming signal. The UARTs are in SPI mode, meaning they do not look for start and stop bits and can run at 10 Mbit/s. Thus, one UART acting alone can sample the incoming signal 10 million times per second. However, the program uses two UARTs, with one offset from the other by one clock cycle, so they sample the incoming signal alternately. This raises the effective sampling rate to 20 million samples per second, which increases the reliability of the receiver.
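The alternating sampling can be modelled on a PC by taking a stream of 20 MHz samples and handing every other sample to each UART. The sketch below does this for one byte's worth of samples; the function name and the packing of the 16 samples into a word (first sample in bit 0) are assumptions made for illustration.

```c
#include <stdint.h>

/* Split 16 consecutive samples (first sample in bit 0) into the byte
 * seen by the UART that samples first and the byte seen by the UART
 * that samples one clock cycle later. Both bytes are assembled
 * least-significant bit first. */
void split_samples(uint16_t samples, uint8_t *first_half, uint8_t *second_half)
{
    uint8_t a = 0, b = 0;
    for (int i = 0; i < 8; i++) {
        a |= (uint8_t)(((samples >> (2 * i)) & 1u) << i);      /* even samples */
        b |= (uint8_t)(((samples >> (2 * i + 1)) & 1u) << i);  /* odd samples */
    }
    *first_half = a;
    *second_half = b;
}
```

When the samples are well aligned with the Manchester-coded signal, one byte is the data and the other is its complement. For example, the samples 0x6566 split into 0xBA and 0x45.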

When the receiver is working correctly, one UART samples the incoming signal during the first half of each bit, and the other during the second half:

In the above example, the received byte is 0x45. UART1 reads 0xBA (the complement of 0x45), and UART2 reads 0x45. The samples are also centered nicely on the highs and lows of the signal, which is often not the case. Often, some of the samples fall on the transitions of the signal, producing undefined results. In that case, one of the UARTs receives a valid signal, whereas the other receives a corrupted one.

To determine which of the signals (if any) is correct, the program uses a property of CRCs: if the CRC32 of a message is appended to that message, then running the CRC32 over the whole thing always leaves 0xDEBB20E3 in the CRC register (this is the register value before the final inversion that is normally applied to produce the CRC). Thus, the program iterates through the message received by UART1 and incrementally adds each byte to the CRC. When the CRC32 register becomes 0xDEBB20E3, the program stops and interprets the previous bytes as the message. If this does not happen, the program does the same thing for the message received by UART2. If neither message is valid, the packet is ignored.
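The residue check can be sketched as follows. This is a simplified host-side version with hypothetical function names, assuming both buffers already contain data bytes in the right polarity and start at the first byte after the preamble; the actual program works within much tighter constraints.

```c
#include <stddef.h>
#include <stdint.h>

#define CRC32_RESIDUE 0xDEBB20E3u

/* Run the CRC32 register over `buf` and stop as soon as it equals the
 * residue. Returns the message length without the 4 CRC bytes, or 0 if
 * the residue never appears. */
size_t find_valid_frame(const uint8_t *buf, size_t len)
{
    uint32_t reg = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        reg ^= buf[i];
        for (int b = 0; b < 8; b++)
            reg = (reg >> 1) ^ ((reg & 1u) ? 0xEDB88320u : 0u);
        if (reg == CRC32_RESIDUE && i + 1 > 4)
            return i + 1 - 4;
    }
    return 0;
}

/* Try the UART1 buffer first, then the UART2 buffer. Returns the buffer
 * that holds a valid frame (storing its length in *msg_len), or NULL if
 * neither does. */
const uint8_t *pick_stream(const uint8_t *uart1, const uint8_t *uart2,
                           size_t len, size_t *msg_len)
{
    if ((*msg_len = find_valid_frame(uart1, len)) != 0)
        return uart1;
    if ((*msg_len = find_valid_frame(uart2, len)) != 0)
        return uart2;
    return NULL;
}
```

One consequence of stopping at the first match is that the length of the packet does not need to be known in advance: any bytes sampled after the end of the packet are simply never examined.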

Since the received voltage levels are much lower than the input voltage levels of the Arduino, a level shifter circuit is needed. Its schematic is shown below:

This circuit adds an offset to the incoming signal, bringing it closer to the Arduino's input threshold voltage (the minimum voltage required for the input to register as high). With this offset, the swing of the incoming Ethernet signal is large enough to be registered by the Arduino. This circuit is probably the cause of the unreliability of the receiver. If a logic IC with more appropriate threshold voltages were inserted between this circuit and the Arduino, the reliability would probably increase without significantly increasing the cost of the device.

Implementing a simple ping server

To implement a ping server, the Arduino must be able to recognize and respond to two types of packets: ARP (address resolution protocol) requests, and ICMP (internet control message protocol) echo requests (pings). When the user pings the Arduino's IP address, the computer first looks up the MAC address of the Arduino using an ARP request. The Arduino receives the ARP request and parses out the relevant sections. If the request is addressed to the Arduino's IP, the Arduino responds with an ARP reply packet containing its MAC address. The computer then sends ICMP echo requests to the Arduino. The Arduino receives these packets, and if they are addressed to the Arduino's MAC address, it responds with an ICMP echo reply. An ICMP echo request typically carries more "payload" data than the 16-byte limit for contiguous blocks of variable data (the Linux ping utility, for example, sends 56 data bytes by default, for a 64-byte ICMP message). The data was therefore hard-coded into the reply packet under the assumption that it wouldn't change, or that if it did, it wouldn't matter.
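The two checks described above could look roughly like this. The sketch is not the program's actual parser: the function and enum names are invented, the byte offsets are the standard ones for ARP over Ethernet and for an IPv4 header without options, and the buffer is assumed to start at the destination MAC address (after the preamble).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum packet_kind { PKT_IGNORE, PKT_ARP_REQUEST, PKT_ICMP_ECHO_REQUEST };

/* Decide whether a received frame needs an ARP reply, an ICMP echo
 * reply, or no response at all. */
enum packet_kind classify_frame(const uint8_t *f, size_t len,
                                const uint8_t my_mac[6],
                                const uint8_t my_ip[4])
{
    if (len < 14)
        return PKT_IGNORE;
    uint16_t ethertype = (uint16_t)((f[12] << 8) | f[13]);

    if (ethertype == 0x0806 && len >= 42) {          /* ARP */
        uint16_t opcode = (uint16_t)((f[20] << 8) | f[21]);
        /* opcode 1 = request; bytes 38..41 hold the target IP address */
        if (opcode == 1 && memcmp(&f[38], my_ip, 4) == 0)
            return PKT_ARP_REQUEST;
    } else if (ethertype == 0x0800 && len >= 35) {   /* IPv4 */
        int to_me = memcmp(&f[0], my_mac, 6) == 0;   /* destination MAC */
        int plain_header = (f[14] == 0x45);          /* IPv4, 20-byte header */
        /* byte 23 = IP protocol (1 = ICMP); byte 34 = ICMP type (8 = echo request) */
        if (to_me && plain_header && f[23] == 1 && f[34] == 8)
            return PKT_ICMP_ECHO_REQUEST;
    }
    return PKT_IGNORE;
}
```

Everything else, including ARP requests for other hosts and IP packets with header options, falls through to PKT_IGNORE in this sketch.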

Possible enhancements

The reliability, performance, and usefulness of this setup could likely be significantly improved by adding a single 74HC86 quad XOR gate chip to the circuit. A possible circuit would be as follows:

The only new components are the XOR gate chip and the two transformers. These transformers are in the same package and are intended to be used with Ethernet interfaces. These transformers can often be extracted from scrap electronics and are sometimes embedded in the Ethernet jack (such as in a Raspberry Pi). XOR chips (especially the surface-mount ones) are still much cheaper than a proper Ethernet controller.

This circuit would perform the Manchester encoding in hardware, so it is not necessary to implement it using software hacks. To transmit data, the EN pin would be switched high, which ensures the outputs of the XOR gates are complements. Then, using one of the Arduino's UARTs, data could be transmitted directly onto the bus, without the need for any preprocessing. This would allow the Arduino to transmit an arbitrary number of bytes from RAM. The signal is Manchester encoded by XORing the data with the clock signal. Note that it is not possible to use the SPI hardware for this, since it is not possible to transmit bytes back-to-back without any pauses in between.

The circuit also uses an XOR gate to buffer the incoming signal. This increases the reliability of the receiver by feeding logic-level signals to the inputs of the Arduino. By operating the XOR chip at a lower voltage than the Arduino, it may also be possible to further improve the reliability.
