Submitting a Shapefile Update

 

The Compare with other layers function simplifies a contributor's task of consolidating record updates to send to the primary database maintainer. Although it's basically a compare function, it can quickly produce a ZIP update archive containing only the necessary records and files. It also writes a detailed log showing precisely what changes were made. The following tutorial assumes you're a Texas Speleological Survey (TSS) data manager compiling data for some region of Texas.

 

With the TSS_Data project loaded in WallsMap, open the Layers Window and right-click on the reference shapefile that was supplied with the downloaded project. It should be an unedited snapshot of the entire database, with the release date being part of its name.

 

 

If option Compare with other layers appears in the menu, the shapefile meets the basic requirement for being the reference in a compare operation. That means it has a template component identifying the key field needed to link comparable records. This attribute field, which is normally maintained automatically and never manually edited, will be populated with unique non-blank values. For another shapefile to be compared against the reference, it must have an attribute field with the same name, type, and length as the reference's key field. In this situation, where we're preparing an update as opposed to receiving one, the compared shapefiles will have originated from the same snapshot, so their structures should exactly match that of the reference.

 

The key field used by the TSS is TSSID, a 4-digit integer prefixed with a 3-character county code, such as VAL0175. The program obtains this information when it processes the reference's template component (TSS All Types_2014-02-03.tmpshp). The template's FLDKEY directive also specifies that items in the log will be sorted by NAME (label field) within COUNTY, and also that features with coordinates north of latitude 32.2 and west of longitude -103.3 will be regarded as being unlocated. When the database is updated, all unlocated features will be assigned these coordinates.
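
As a rough illustration of the key format described above, the following Python sketch checks whether a string looks like a TSSID and splits it into its county code and number. It is not part of WallsMap. The function name split_tssid is hypothetical, and the pattern assumes the county code is three uppercase letters, which the example VAL0175 suggests but the text does not state outright.

```python
import re

# Assumed shape of a TSSID: 3 uppercase letters followed by 4 digits (e.g. VAL0175).
TSSID_PATTERN = re.compile(r"[A-Z]{3}\d{4}")


def split_tssid(key):
    """Return (county_code, number) for a well-formed key, or None otherwise."""
    stripped = key.strip()
    if TSSID_PATTERN.fullmatch(stripped) is None:
        return None  # blank or malformed key
    return stripped[:3], int(stripped[3:])


if __name__ == "__main__":
    print(split_tssid("VAL0175"))
    print(split_tssid(""))
```

A blank key returns None here, which matches the way new records in a compared shapefile are allowed to have no key yet.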

 

Unlike the reference shapefile, the compared shapefiles can have new records with blank key values. Those records will have a CREATED timestamp tagged with the editor's name. Other records that have been revised will have a similar UPDATED timestamp that's newer than that of the corresponding record in the reference.

 

In this example, the shapefiles of interest are five visible ones grouped under folder TSS Categories. They aren't required to have these names, nor must they reside in the same disk folder. They should, however, be stored in a folder where their file links are valid, allowing the retrieval of any new files for the archive. Before initiating a comparison, make sure that these shapefiles have checked visibility boxes. That allows them to appear in the dialog that opens when you select Compare with other layers:

 

 

Initially, all visible shapefiles suitable for comparing with the reference shapefile will appear on the left side under Candidate Shapefiles. As specified in the reference's template component, they will be point shapefiles having TSSID as one of their fields. They will also have the COUNTY field that's required for key generation when the update is processed.

 

Note that we've moved the five shapefiles that we normally edit to the dialog's right side using the middle set of buttons. We've left one candidate unselected because it wasn't meant to be compared. In fact, including it with the others would cause the compare operation to detect duplicate keys and terminate with an error message. The selected set of shapefiles must collectively have records with either unique keys or empty (blank) keys.
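
The uniqueness rule can be checked outside the program before you run a comparison. The sketch below is only an illustration under stated assumptions: each selected shapefile's attribute table is assumed to have already been read into a list of dictionaries (how you read the DBF is up to you), and the function name find_duplicate_keys and the layer names in the example are hypothetical.

```python
from collections import Counter


def find_duplicate_keys(layers, key_field="TSSID"):
    """Return the sorted non-blank keys that occur more than once across all layers.

    layers maps a layer name to a list of record dictionaries.
    Blank keys are skipped because new records are allowed to have them.
    """
    counts = Counter()
    for records in layers.values():
        for record in records:
            key = record.get(key_field, "").strip()
            if key:
                counts[key] += 1
    return sorted(key for key, n in counts.items() if n > 1)


if __name__ == "__main__":
    selected = {
        "Caves": [{"TSSID": "VAL0175"}, {"TSSID": ""}],
        "Sinks": [{"TSSID": "VAL0175"}, {"TSSID": "BEX0001"}],
    }
    print(find_duplicate_keys(selected))
```

An empty result means the selected set satisfies the rule. Any key listed would presumably trigger the duplicate-key error described above.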

 

Although five shapefiles have been selected, only a single shapefile will be generated by the compare function. It will lump together all the new and revised records found in the selected set and produce UPD_Reddell_2014-02-17.shp, storing it both inside and outside the ZIP file. The only reason we're comparing the reference against multiple files is that a TSS database release contains both a complete snapshot and an editable copy of it in the form of five different map layers. At least initially, the layers are distinguished by karst feature type, but since karst type is also an attribute field value, and one that can be edited, the particular source shapefile of an updated record is of no concern to the person processing the update.

 

The option Create updated reference is for use by the database maintainer. As the submitter of the update, you should leave it unchecked since you want the output to include only the records you've changed.

 

You should leave option Log outdated records checked. This will ensure that even records that don't have newer UPDATED timestamps are compared with the reference's versions and listed with a caution message if differences are found. There shouldn't be any if you've chosen the correct reference shapefile, since the compared layers originate from the same database snapshot. In any case, outdated records will not be included in the update you're creating.

 

Values for an optional set of ignored fields (see button at dialog's top-right) won't take part in the comparison, nor will they appear in the log. Ordinarily, if there are any ignored fields, they will have been identified in the reference's template component, in which case you'll routinely leave this dialog setting unchanged.
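
To make the record categories concrete, here is a minimal Python sketch of how a compared record might be classified against the reference. It is an approximation of the behavior described in this tutorial, not the program's actual logic. It assumes records are dictionaries, that UPDATED has already been parsed into a datetime (the real field also carries the editor's name), and that a key missing from the reference is simply reported as "unmatched". The function name classify_record is hypothetical.

```python
from datetime import datetime


def classify_record(record, reference_by_key, ignored=()):
    """Classify a compared record as new, revised, outdated, unchanged, or unmatched."""
    key = record.get("TSSID", "").strip()
    if not key:
        return "new"  # blank key: a record added by the contributor
    ref = reference_by_key.get(key)
    if ref is None:
        return "unmatched"  # assumption: key not found in the reference
    if record["UPDATED"] > ref["UPDATED"]:
        return "revised"  # newer timestamp: goes into the update
    # Not newer: compare field values, skipping ignored fields and the timestamp.
    skip = set(ignored) | {"UPDATED"}
    differs = any(
        record.get(name) != value for name, value in ref.items() if name not in skip
    )
    return "outdated" if differs else "unchanged"


if __name__ == "__main__":
    reference = {
        "VAL0175": {"TSSID": "VAL0175", "NAME": "A Cave", "UPDATED": datetime(2014, 2, 3)}
    }
    edited = {"TSSID": "VAL0175", "NAME": "A Cave", "UPDATED": datetime(2014, 2, 10)}
    print(classify_record(edited, reference))
```

In this sketch only "new" and "revised" records would go into the update. "Outdated" records correspond to the ones that are logged with a caution but left out.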

 

In addition to choosing whether to create a ZIP or just a log, you may want to change the suggested Base pathname for the output files. The program constructs a suggestion based on this format:

 

<path of reference shapefile>\UPD_<editor's last name>_<current date>

 

You can store the files anywhere, but I recommend you choose either the reference's folder or a sibling folder at the same level. That's because the generated shapefile (the copy of it remaining outside the ZIP) would then have valid file links. It's convenient to be able to add it to your project and examine it along with the log file. You might discover mistakes that you, rather than the maintainer, can easily fix.
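
For reference, the suggested base pathname format can be reproduced with a few lines of Python. This is only a sketch: the function name suggest_base_pathname and the example folder are hypothetical, and the ISO date format is inferred from the output name UPD_Reddell_2014-02-17 used in this tutorial.

```python
from datetime import date
from pathlib import PureWindowsPath


def suggest_base_pathname(reference_shp, editor_last_name, today):
    """Build <reference folder>\\UPD_<last name>_<YYYY-MM-DD> as a string."""
    folder = PureWindowsPath(reference_shp).parent
    return str(folder / f"UPD_{editor_last_name}_{today.isoformat()}")


if __name__ == "__main__":
    # Hypothetical location of the reference shapefile.
    ref = r"C:\TSS_Data\TSS All Types_2014-02-03.shp"
    print(suggest_base_pathname(ref, "Reddell", date(2014, 2, 17)))
```

PureWindowsPath is used so the sketch produces backslash-separated paths on any operating system.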

 

When you click the Compare button in the above dialog, a summary message similar to this should appear after a few seconds:

 

 

At this point you can choose "No", in which case the program will optionally open UPD_Reddell_2014-02-17.txt in your text editor. The log is a detailed summary of the compare results, listing all proposed record additions and updates. When reviewing a lengthy log you may want to first search for "*** NOTE" and "*** CAUTION" (or simply "*** ") to see what the compare function considered worthy of your attention. For example, a common mistake is adding a new record for a feature that already exists in the database, possibly in a different category or with a different location status. In the TSS database, this is suspected when a new record has the same NAME and COUNTY as that of an existing record.
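
If you prefer to scan a long log with a script rather than your editor's search, something like the following Python sketch would do. Only the "*** " marker comes from this tutorial. The function name flagged_lines and the sample log text are hypothetical, and the real log layout may differ.

```python
def flagged_lines(log_text, marker="*** "):
    """Return (line_number, text) pairs for log lines containing the marker."""
    return [
        (number, line.rstrip())
        for number, line in enumerate(log_text.splitlines(), start=1)
        if marker in line
    ]


if __name__ == "__main__":
    # Made-up sample text, not actual program output.
    sample = "Record one\n*** NOTE: example note\nRecord two\n*** CAUTION: example caution\n"
    for number, text in flagged_lines(sample):
        print(number, text)
```

To run it against a real log, read the file first, for example with open("UPD_Reddell_2014-02-17.txt").read(), and pass the result to flagged_lines.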

 

If you respond "Yes" to the above prompt, you'll soon see another prompt with a bit of additional information about the requested ZIP:

 

 

That many new linked files is unusual for a single update. In fact, in this demonstration I've used the compare function to summarize all changes made to the TSS database over a two-month period beginning Dec 1, 2013.

 

Finally, choosing OK initiates archive creation. After a few more seconds, during which the names of added files scroll by rapidly, you should see this:

 

 

The ZIP file can be sent by email, if small enough, or placed in a cloud storage folder if you're sharing one with the database maintainer. An alternative is to remove some or all of the image files from the ZIP and send them as separate email attachments, either individually or in groups. The pared-down archive will likely be small enough to email.

 

While it's important to link submitted image files to the appropriate database records, ideally placing them in the folder structure TSS has adopted, I suggest contributors not resize the files unless they're too large to be sent as attachments. When they are received, most image files are archived for the TSS collection before being conditioned for the TSS_Data project. The linked photos and map scans are optimized for screen viewing, with the file extension being changed in some cases. The body of the original file name is preserved. The benefit of resizing and changing formats is a dramatically smaller project, with images that load faster and look as good or better on screen.

 

To see how an update archive is handled on the receiving end, see Processing a Shapefile Update.

 

General Usage Note: For other applications of the compare function, the reference and compared shapefiles might well have different attribute table structures. Only the fields with matching names are examined in that case. Also, the types (fixed character vs. memo) and lengths of compared fields don't have to match exactly. Field values are reformatted as necessary for the output shapefile, which will have the same structure as the reference. Values for fields not present in the reference are discarded. Values for fields not present in the compared shapefiles are either copied from the reference or set blank when a record is being added. These features can be useful for the maintainer when processing an update, or in general when merging data from multiple shapefiles.
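
The field-matching rules in this note can be summarized in a short Python sketch. It models only which values end up in the output record, not the reformatting of types and lengths. Records are assumed to be dictionaries, and the function name conform_to_reference is hypothetical.

```python
def conform_to_reference(compared, reference_fields, reference_record=None):
    """Return a record with exactly the reference's fields.

    compared: field values from the compared shapefile.
    reference_fields: field names in the reference's structure, in order.
    reference_record: the matching reference record for an update,
        or None when the record is being added.
    """
    output = {}
    for name in reference_fields:
        if name in compared:
            output[name] = compared[name]  # matching field name: take the compared value
        elif reference_record is not None:
            output[name] = reference_record.get(name, "")  # update: copy from reference
        else:
            output[name] = ""  # addition: leave blank
    return output  # fields absent from the reference are discarded


if __name__ == "__main__":
    fields = ["TSSID", "NAME", "COUNTY"]
    old = {"TSSID": "VAL0175", "NAME": "Old Name", "COUNTY": "Val Verde"}
    print(conform_to_reference({"NAME": "New Name", "EXTRA": "x"}, fields, old))
```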