Reading Files with Python

Note: this article is a work in progress as at 21:26, 01 July 2014 (UTC). User:Gwynne

Python’s standard library includes a module, “chunk” (https://docs.python.org/2/library/chunk.html), for parsing chunked files, which makes iterating over the individual chunks in MODO’s IFF-based files relatively straightforward. It has one minor limitation: it only supports chunks with a 4-byte ‘size’ field, so it won’t natively process subchunks (which have only a 2-byte size field). This is easily solved, however, by subclassing the module’s Chunk class and overriding how the size field is read.
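To make the difference between the two header layouts concrete, here is a minimal sketch that uses only ‘struct’ and does not rely on the chunk module. The function name read_header and the sample bytes are made up for illustration. It assumes each header is a 4-byte ID followed by a big-endian size field of either 4 bytes (chunks) or 2 bytes (subchunks).

```python
import io
import struct


def read_header(stream, size_bytes=4):
    """Read one chunk header and return (chunk_id, size).

    size_bytes is 4 for top-level chunks and 2 for subchunks.
    """
    # '>L' is a big-endian unsigned 32-bit int, '>H' an unsigned 16-bit int
    size_format = {4: '>L', 2: '>H'}[size_bytes]
    chunk_id = stream.read(4)
    (size,) = struct.unpack(size_format, stream.read(size_bytes))
    return chunk_id, size


# hypothetical sample data: one chunk with a 4-byte size field and one
# subchunk with a 2-byte size field
chunk_bytes = b'ABCD' + struct.pack('>L', 6) + b'sample'
subchunk_bytes = b'WXYZ' + struct.pack('>H', 3) + b'abc'

chunk_header = read_header(io.BytesIO(chunk_bytes))
subchunk_header = read_header(io.BytesIO(subchunk_bytes), size_bytes=2)
print(chunk_header, subchunk_header)
```

The only thing that changes between the two cases is the width of the size field, which is why overriding that one step is enough to reuse the rest of the chunk-reading logic.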

Opening a file

All IFF files start with a 4-character ID field (‘FORM’), so to check that we have a valid file the first thing we’ll need to do is open it and check for the right ID. Since IFF files are a binary format (and big-endian), we’ll need to use Python’s ‘struct’ module fairly heavily to unpack the data, prefixing the format string with ‘>’ to force big-endian mode.

#!/usr/bin/env python

import os
import struct

# using the TurtleDinoMan-Anim scene from MODO's content
infile = r'C:\Data\Luxology\Content\Samples\Animation\TurtleDinoMan-Anim.lxo'

# open the scene file to read in binary mode
with open(infile, 'rb') as iff_file:
    # read the first 4 bytes & unpack them using struct.unpack(). The returned
    # value is a tuple & the unpacked 4 character string is its first element.
    # Note that IFF files are Motorola (or big-endian) byte order so we need to
    # force struct.unpack() into big-endian mode by prefixing the format string
    # with '>'
    file_id = struct.unpack(">4s", iff_file.read(4))[0]
    # if the first 4 bytes of the file don't unpack as the string 'FORM' then we
    # don't have an IFF file so exit with an error. The bytes literal keeps the
    # comparison valid under Python 3 as well as Python 2
    if file_id != b'FORM':
        raise RuntimeError('%s: Not an IFF file' % os.path.basename(infile))
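
The check can be tried without a scene file by feeding it bytes from memory. The sketch below builds a made-up 12-byte header with struct.pack() and also shows what the ‘>’ prefix changes. The name is_iff and the sample values are illustrative only, not part of MODO or the standard library.

```python
import io
import struct


def is_iff(stream):
    """Return True if the stream starts with the IFF 'FORM' ID."""
    header = stream.read(4)
    # a file shorter than 4 bytes cannot be an IFF file
    if len(header) < 4:
        return False
    return struct.unpack('>4s', header)[0] == b'FORM'


# made-up header: 'FORM', a data size of 4 and the form type 'LXOB'
sample = b'FORM' + struct.pack('>L', 4) + b'LXOB'
valid = is_iff(io.BytesIO(sample))
invalid = is_iff(io.BytesIO(b'RIFF' + sample[4:]))

# the same 4 bytes decode to different integers depending on byte order
raw = b'\x00\x00\x01\x00'
big = struct.unpack('>L', raw)[0]
little = struct.unpack('<L', raw)[0]
print(valid, invalid, big, little)
```

Byte order has no effect on the ‘4s’ field itself; it starts to matter as soon as integer fields such as the size are unpacked.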

That confirms that our file is an IFF file, but it doesn’t tell us whether we have a MODO scene (or preset) file. For that we also need to read the form’s type ID field. We can do that at the same time as checking the IFF ID by reading 12 bytes instead of 4, which gives us the IFF ID, the size of the file’s data and the IFF type (the IFF type is actually the first field of the file’s data block). There are currently 4 type IDs defined for MODO files:

  • LXOB - scene file

  • LXPR - (generic) preset file

  • LXPE - environment preset file

  • LXPM - mesh preset file

import os
import struct

# using the TurtleDinoMan-Anim scene from MODO's content
infile = r'C:\Data\Luxology\Content\Samples\Animation\TurtleDinoMan-Anim.lxo'

# open the scene file to read in binary mode
with open(infile, 'rb') as iff_file:
    # read the first 12 bytes of the file, assigning the first 4 to the IFF file
    # ID, the second 4 as an unsigned long integer which contains the length of
    # the file's data and the final 4 to the IFF form type ID
    file_id, datasize, formtype = struct.unpack(">4sL4s", iff_file.read(12))

    # if the first 4 bytes of the file don't unpack as the string 'FORM' then we
    # don't have an IFF file so exit with an error
    if file_id != b'FORM':
        raise RuntimeError('%s: Not an IFF file' % os.path.basename(infile))

    # if the form type ID isn't 'LXOB' we don't have a scene file, exit with an
    # error
    if formtype != b'LXOB':
        raise RuntimeError('%s: Not a MODO scene file' % os.path.basename(infile))
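
Since all four form types share the same 12-byte header, the two checks can be folded into one helper that accepts any MODO file rather than scenes only. This is a sketch under that assumption: FORM_TYPES and read_form_header are hypothetical names, and the descriptions simply restate the list of type IDs above.

```python
import io
import struct

# form type IDs listed above, mapped to a short description
FORM_TYPES = {
    b'LXOB': 'scene file',
    b'LXPR': 'preset file',
    b'LXPE': 'environment preset file',
    b'LXPM': 'mesh preset file',
}


def read_form_header(stream):
    """Return (formtype, datasize) or raise ValueError for other files."""
    header = stream.read(12)
    if len(header) < 12:
        raise ValueError('file too short for an IFF header')
    file_id, datasize, formtype = struct.unpack('>4sL4s', header)
    if file_id != b'FORM':
        raise ValueError('not an IFF file')
    if formtype not in FORM_TYPES:
        raise ValueError('not a MODO file')
    return formtype, datasize


# made-up header for a mesh preset with 4 bytes of data (just the type ID)
sample = io.BytesIO(b'FORM' + struct.pack('>L', 4) + b'LXPM')
formtype, datasize = read_form_header(sample)
print(FORM_TYPES[formtype], datasize)
```

Taking a stream rather than a file name means the same helper works on an open file object and on in-memory bytes, which makes it easy to test.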