Hacker's SHA-1 Documentation

Programmer's Guide (Edition 1.0)

Julius C. Duque


Copyright (C) 2002 Julius C. Duque

Permission is granted to make and distribute verbatim copies of this Programmer's Guide, provided the copyright notice and this permission notice are preserved on all copies.

Permission is granted to copy and distribute modified versions of this Programmer's Guide under the conditions for verbatim copying, provided that the entire resulting derived work is distributed under the terms of a permission notice identical to this one.

Permission is granted to copy and distribute translations of this Programmer's Guide into another language, under the above conditions for modified versions, except that this permission notice may be stated in a translation approved by the Free Software Foundation.

IN NO EVENT SHALL THE AUTHOR BE LIABLE TO ANY PARTY FOR DIRECT, INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OF THIS PROGRAMMER'S GUIDE. FURTHERMORE, THE AUTHOR SPECIFICALLY DISCLAIMS ANY WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THIS PROGRAMMER'S GUIDE IS ON AN "AS IS" BASIS, AND THE AUTHOR HAS NO OBLIGATION TO PROVIDE MAINTENANCE, SUPPORT, UPDATES, ENHANCEMENTS, OR MODIFICATIONS.

Acknowledgements

This program would not have been possible if not for Cecchinel Stephan, whose email address is interzone@pacwan.fr. Some lines in `sha1sum.asm' were taken from `md5sum.asm' and `rmdsum.asm', both written by Cecchinel.

I would also like to thank the ASM Utils Team and the NASM Team for producing such fine pieces of utilities.

Limitations

sha1sum has only been tested on an Intel Pentium 4 (32-bit) machine, running Linux 2.4.18. This means that the maximum file size that can be hashed is only (2147483647) bytes. Because of this, sha1sum has only been optimized to take advantage of this size limitation. It is very unlikely that sha1sum can still output correct message digests of files whose sizes exceed this limit.

`sha1sum' uses the @command{bswap} instruction, which is available only on the 486 and later Intel machines. Therefore, you need at least a 486 machine to run `sha1sum'.

Unlike `md5sum.asm' and `rmdsum.asm', `sha1sum.asm' does not read from STDIN; there must be at least one input file.

`MCONFIG' Settings

Modifying `MCONFIG' to use `SYSCALL = LIBC' produces Segmentation fault, which is otherwise absent when `SYSCALL = KERNEL'. To correct this, `STARTUP = y' must also be set.

Setting `OS = FREEBSD' or `OPTIMIZE = SPEED' or both, does not produce any error. Outputs from both settings are also the same.

Print Functions

    _mov   ecx, base_N
    _mov   ebx, regSize

  _PushStack:
    xor    edx, edx
    div    ecx
    push   edx
    dec    ebx
    test   ebx, ebx
    jz     short _PopStack
    call   _PushStack

  _PopStack:
    pop    edx
    _add   edx, '0'
    cmp    edx, byte '9'
    jbe    short _PrintChar
    _add   edx, hexFormat

  _PrintChar:
    _mov   [t], edx
    sys_write STDOUT, t, 1
    ret

The PrintNum function prints the content of a 32-bit register in any base representation. By default, PrintNum prints in hexadecimal notation (base 16).

The idea behind the workings of PrintNum is to divide the original number by the desired base representation, and then store the resulting remainder. The integer part of the resulting quotient then becomes the new dividend. The division process is repeated until the quotient becomes 0. When finished, the remainders are read in the reverse order that they were stored. The remainders represent the original number printed in the desired base.

To use PrintNum, eax should already hold the number to be printed. eax is the dividend, and ecx, which has been previously initialized with base_N, is the divisor. When eax is divided by ecx, the integer part of the resulting quotient is stored in eax, overwriting its earlier content, and the remainder is stored in edx. edx is saved by push-ing it in the stack (by calling _PushStack). The division process is repeated until eax becomes 0, at which time, the contents of the stack are pop-ped off (by calling _PopStack).

If a different representation is desired, modify the variable base_N.

Now, consider this. If the output should be in hexadecimal notation, 4 bits are needed to represent one hexadecimal digit. So, a 32-bit register can represent, at a maximum, 8 hexadecimal digits. If PrintNum must always print exactly 8 digits, the call to _PushStack must also be done exactly 8 times. In this case, ebx, which has been initialized with regSize, is used as a counter for calls to _PushStack. By default, regSize is set to 8.

The output can also be printed in either lowercase or uppercase. The variable hexFormat can be set to 0x27, if lowercase hexadecimal notation is desired, or to 0x07, if the uppercase form is desired.

Also, notice that the contents of W are no longer needed during the printing of the final message digest. That is why it is safe to use t to also act as the last element of W.

SHA-1 Round Functions

The original round functions, as specified in FIPS 180-1 (Secure Hash Standard), were replaced with their alternate forms to minimize code size. The round functions and their alternates are as follows:

is bitwise OR, is bitwise AND, is bitwise exclusive OR, and is negation. Notice that the second and fourth round functions are the same, and hence, we can code sha1sum with only three round functions.

F0:
    _mov   ebx, [edi-16]  ; same as_mov ebx, [B]
    _mov   ecx, [edi-12]  ; same as_mov ecx, [C]
    _mov   edx, [edi-8]   ; same as_mov edx, [D]
    ret

F1:
    call   F0
    xor    ecx, edx
    and    ebx, ecx
    xor    edx, ebx       ; edx now holds result
    ret

F2:
    call   F0
    xor    edx, ecx
    xor    edx, ebx       ; edx now holds result
    ret

F3:
    call   F0
    or     ebx, ecx
    and    edx, ebx
    _mov   ebx, [edi-16]  ; revive old B
    and    ecx, ebx
    or     edx, ecx       ; edx now holds result
    ret

Procedure F0 is executed first before actually executing the round functions. F0 uses the following trick. Before calling the round functions, edi should already hold the address of W. Based from the arrangement of variables in UDATASEG, E is 4 bytes behind W, D is 8 bytes behind W, C is 12 bytes behind W, and so on.

SHA-1 Transformation

SHA1_TRANSFORM does the bulk of the work in computing the message digest. As in the code used for the round functions (section SHA-1 Round Functions), the reference point is also W. The same strategy is also employed, accessing A, B, C, D, and E, solely from their offset from W. In addition, the alternate method of computing the message digest, as stated in Section 8 of FIPS 180-1, is used to further cut down on code size. However, this will likely lengthen the execution time due to the increased complexity of address computations.

Byte Orientation Issues

    call   INIT_W
    sub    edi, byte BLOCKSIZE  ; same as _mov edi, W
    _mov   ecx, BLOCKSIZE/4   ; transfer is done 4 bytes at a time
    push   eax

  BYTE_REVERSE:
    lodsd
    bswap  eax            ; store in big-endian orientation
    stosd                 ; transfer contents of buffer to W
    loop   BYTE_REVERSE
    pop    eax
    ret

SHA-1, by design, favors big-endian machines. For Intel machines, which are little-endian, it is, therefore, necessary to first reverse the byte orientation of data to be operated on, before actually doing any computation. The instruction @command{bswap} does the trick of converting big-endian to little-endian orientation, and vice versa. Note, however, that @command{bswap} is available only on the 486 and later Intel machines.


This document was generated on 18 February 2006 using texi2html 1.56k.