When I started on the first version of SCOCA’s District File Repository (DFR) a few years ago I needed a way to have an updated list of our school districts and IRN identifiers. Ryan suggested that I use a file titled disbld.seq, a fixed length text file that is updated monthly and sent to ITCs. When I first imported the file for MySQL, I used a Windows desktop application. I also use disbld for the SCOCA State Districts Directory. a somewhat popular application on our site.
Last week I started to upgrade the Districts Directory application, mainly to automate the conversion of disbld to MySQL on a regular basis. I know some of our schools run PHP and MySQL so I thought I might post a bit on how fixed length files differ from comma-separated files.
If you’ve ever had to import a comma or TAB delimited file into MySQL, you know it’s a fairly painless process. All you really need to do is delineate the field separator (comma or TAB) and let MySQL know if the fields are enclosed with double-quotes or some other character. If you use phpMyAdmin to manage your MySQL server (and you should ) there’s an import option to make the process easier.
Fixed-length files are a bit different because fields aren’t enclosed or separated. Here’s an entry from the disbld file that I work with (scroll horizontally to view the entire entry):
S035451Springfield Intermediate MiddlMahoning 11333 Youngstown Pittsburgh RdNew Middletown OH4444287243305423624170010104837130100503
Preparing for Import
What you first need to do with a fixed-length file is count how many characters make up a field for each entry. The disbld file breaks down like this:
- 1 character that defines if the building is a District or School (D or S)
- 6 characters for the IRN number (disbld uses leading zeroes to make the length equal)
- 30 characters for District or School name
- 10 characters for county
- 30 characters for address
- 17 characters for city
- 2 characters for state
- 9 characters for zip code
- 10 characters for phone number
- 23 characters for “misc” – not used in any of my apps
All of the fields except for one can be defined in MySQL as CHAR or VARCHAR field types (depending on length). The special case is the IRN field, and here’s where thinking ahead when you code an application pays off.
IF a user wants to search for a district or school IRN number, he or she wouldn’t use a leading zero for the number, since OHIO IRNs are usually not listed this way. In the above example an individual would type in 35451 in an IRN search, not 035451. So how do we fix this when importing the file? The answer is to make the IRN field an integer (INT) instead of a CHAR field type. When MySQl imports the file, the leading zeroes will be dropped.
Remember, define the MySQL table structure first, and make sure that the length of your fields exactly match what is in the file you are importing.
Import the File into MySQL
The easiest way to import the file is to load it onto the same server where MySQL resides and use MySQL’s LOAD DATA LOCAL INFILE command. You can do this from phpMyAdmin by going to your database and clicking the SQL tab. Here’s the recipe:
load data local infile 'C:\\filedir\disbld.seq' into table disbld_db.disbld fields enclosed by '' terminated by '' lines terminated by n' (`btype`, `irn`, `name`, `county`, `address`, `city`, `state`, `zip`, `phone`, `misc`)
Two things to note are:
- fields are terminated and enclosed with two single quotes (indicating that fields are not enclosed or terminated)
- you must include the field names of your table when using this command.
Last Task
Finally, you need to add an easy unique indentifier/primary key for each entry in the table. This is usually done by adding an auto-incrementing INT field type with a size of 11. The command below will add the field at the beginning of each entry, and after the addition each entry will have its own ID number. In my case I named the field DISID
ALTER TABLE disbld ADD disid INT(11) NOT NULL AUTO_INCREMENT PRIMARY KEY FIRST
Below is a screenshot that shows the final import in phpMyAdmin (click for full size):
