root/BADataMunger/trunk


Rev Chgset Date Author Log Message
@1444 [1444] 11/18/08 15:13:32 thomase code and changes for converting raw directory html in well-formed xhtml …
@1427 [1427] 09/12/08 15:45:31 thomase sticking a fork in the aspect of BADataMunger that is the creation of …
@1372 [1372] 08/20/08 18:31:51 thomase config and modified source files for latest batches of maps
@1371 [1371] 08/20/08 18:22:40 thomase suppressed serialization of unlabeled features and added support for …
@1370 [1370] 08/20/08 18:21:22 thomase added parsing support for: named aqueducts; baths; boundaries; various …
@1369 [1369] 08/20/08 18:19:53 thomase Handle unitalicized postfix parentheticals when parsing names. Add notes …
@1353 [1353] 08/04/08 14:53:13 thomase config and directory source files for maps 23, 84, 85, 87, 87 inset and …
@1352 [1352] 08/04/08 14:42:54 thomase added support for the 'I' suffix on map numbers: only occurs on map 87 …
@1351 [1351] 08/04/08 14:42:24 thomase added parsing and id creation support for centuriation, dikes, numbered …
@1350 [1350] 07/26/08 13:45:47 thomase adding files associated with the 2007-07-26 BAtlas ID release
@1349 [1349] 07/26/08 13:43:12 thomase added a module-level getpipe function to speed up use
@1348 [1348] 07/26/08 13:42:34 thomase label generation support for mines and springs
@1347 [1347] 07/26/08 13:41:55 thomase better support for different types of mine tables; support for …
@1345 [1345] 07/22/08 13:38:50 thomase files supporting 2007-07-22 release
@1344 [1344] 07/22/08 13:36:47 thomase suppress coastal change; develop more nuanced handling of bad markup …
@1342 [1342] 07/21/08 16:01:18 thomase new maps
@1341 [1341] 07/21/08 16:00:57 thomase new maps
@1340 [1340] 07/21/08 16:00:00 thomase template for writing output xml file headers
@1339 [1339] 07/18/08 17:10:56 thomase prevent duplicate ids and citations; tweak citation rendering
@1338 [1338] 07/18/08 17:09:11 thomase add handling for aqueduct-group, levee, quarry, villa, villa-group
@1337 [1337] 07/18/08 17:08:16 thomase refactor basic table parsing for extensibility (use a 'column map' …
@1336 [1336] 07/13/08 14:21:44 thomase postfix Inss. Ins. and fl. onto alternate/variant names, but only in their …
@1335 [1335] 07/13/08 11:37:06 thomase looks like a functionally complete config file for this simple directory
@1334 [1334] 07/13/08 11:28:54 thomase another victim
@1333 [1333] 07/11/08 16:28:21 thomase updating docstring for current usage
@1332 [1332] 07/11/08 15:52:30 thomase this thing actually makes ids properly for all feature types (except …
@1331 [1331] 07/11/08 15:50:54 thomase cleaner code for error messages on IDless mods elements
@1330 [1330] 07/11/08 15:48:17 thomase better reporting and failure detection; handle earthworks properly; more …
@1318 [1318] 05/23/08 16:21:48 thomase intermediate support for generating batlas ids for an entire directory …
@1317 [1317] 05/23/08 16:21:42 thomase intermediate support for generating batlas ids for an entire directory …
@1316 [1316] 05/23/08 16:21:32 thomase intermediate support for generating batlas ids for an entire directory …
@1306 [1306] 05/03/08 13:37:08 thomase added header, improved docstring and added working tests; also changed …
@1305 [1305] 05/03/08 12:33:31 thomase working tests for wordstripper.py
@1304 [1304] 05/03/08 12:32:55 thomase added header
@1303 [1303] 05/03/08 12:04:30 thomase make validation benchmark explicit
@1302 [1302] 05/03/08 12:03:35 thomase no need for doctype here; etree will not serialize to output anyway
@1301 [1301] 05/03/08 11:30:46 thomase added header, docstring and tests, and changed code to use unicode-aware …
@1300 [1300] 05/02/08 19:28:50 thomase dependencies for BADataMunger
@1299 [1299] 05/02/08 17:16:57 thomase added standard headers and improved/added docstrings
@1298 [1298] 05/02/08 16:55:01 thomase working tests for etree helps
@1297 [1297] 05/02/08 16:43:35 thomase working tests for character normalization
@1296 [1296] 05/02/08 16:08:30 thomase beginnings of tests and supporting data
@1238 [1238] 02/12/08 12:07:59 thomase may include content not in the main library file
@1236 [1236] 02/12/08 10:55:33 thomase raw bibliographic library export from old map center biblio database; used …
@1215 [1215] 10/25/07 17:04:33 thomase two stylesheets whereby one of our KML point files can be rendered into a …
@1180 [1180] 10/16/07 18:03:21 thomase add all required suppression directives to keep non-native places (i.e., …
@1179 [1179] 10/16/07 18:02:08 thomase support earthworks, quarries, walls and mines
@1178 [1178] 10/16/07 18:01:20 thomase generate suppression directives for gismixer from directory files
@1177 [1177] 10/16/07 18:00:45 thomase provide a more flexible range of suppression options when evaluating …
@1159 [1159] 10/12/07 17:14:58 thomase doh
@1158 [1158] 10/12/07 17:11:00 thomase PleiadesEntity/extensions
@1157 [1157] 10/12/07 17:07:42 thomase created
@1156 [1156] 10/10/07 13:47:52 thomase config parameters for map 22
@1082 [1082] 08/27/07 14:53:00 thomase modify some logging and handle capitalization variation in feature types …
@1069 [1069] 08/23/07 17:53:39 thomase append references for feature names properly
@1059 [1059] 08/21/07 11:25:46 thomase markup subcomponents of bibliographic references/citations and attempt to …
@1058 [1058] 08/21/07 11:24:04 thomase process just the geography and dir stuff to produce frankenformat for …
@1057 [1057] 08/21/07 11:21:26 thomase add support for cleanup of xlink attributes
@1021 [1021] 08/15/07 17:59:18 thomase massive regular expression voodoo to insert tei bibliographic tagging for …
@951 [951] 08/10/07 05:12:11 thomase do just the bibliographic munging, separate from the place munging
@950 [950] 08/10/07 05:11:32 thomase placesaver.py has been using this for a while!
@949 [949] 08/10/07 05:09:12 thomase copy recordInfo nodes from 'library' mods to 'student' mods
@864 [864] 07/10/07 15:19:25 thomase minor: change warning message to info message
@863 [863] 07/10/07 14:23:11 thomase Add option via config file to suppress individual features.
@862 [862] 07/10/07 13:56:17 thomase handles the new "use case" that surfaced with Map 38: unlabeled point …
@858 [858] 06/22/07 15:01:35 thomase Handle multiple locations, types and approximation indicators per place.
@857 [857] 06/19/07 13:50:38 thomase Fix an xpath construction error in the mods mixing cascade.
@856 [856] 06/19/07 13:50:09 thomase Move control over mixing of GIS data from the pipeline level down to a new …
@847 [847] 06/18/07 16:23:12 thomase More aggressive error checking on the gis/dir mixing steps.
@846 [846] 06/18/07 16:11:53 thomase entering disambiguator numbers correctly is a good idea
@845 [845] 06/18/07 15:11:41 thomase Fixed logic bug that prevented unlocated places from getting written to …
@844 [844] 06/17/07 06:50:16 thomase Incorporate the mods mixing process (enhancing the records with data from …
@843 [843] 06/15/07 16:26:34 thomase Magically expand shorthand references to RE, NPauly, KlPauly and PECS
@842 [842] 06/15/07 15:41:33 thomase All certainty measures, plus name-wise references.
@841 [841] 06/14/07 16:51:55 thomase properly handle unlocateds and falsae
@840 [840] 06/14/07 16:41:57 thomase Sane dirpath specification and data mixing.
@837 [837] 06/14/07 13:56:03 thomase properly write classificationSection for feature names (with an internal …
@836 [836] 06/14/07 12:09:50 thomase Save place data to xml
@827 [827] 06/13/07 15:38:37 thomase gotta have config files!
@826 [826] 06/12/07 16:54:45 thomase rudimentary and buggy mixing map data with directory data
@825 [825] 06/12/07 14:10:41 thomase Saving full place information using the Pleiades frankenformat. Partial …
@824 [824] 06/12/07 14:08:16 thomase Better xpath construction using namespaces. Copy all the nodes we need …
@823 [823] 06/12/07 14:07:16 thomase Changed namespace cleanup calls to use the generic one for BADataMunger, …
@816 [816] 05/29/07 12:38:15 thomase one transform to rule them all
@815 [815] 05/29/07 12:36:22 thomase support for the TEI namespace
@814 [814] 05/25/07 17:41:03 thomase all sorts of nifty stuff to deal with names; needs more testing
@813 [813] 05/25/07 17:40:39 thomase pick up all the details from the library copy
@812 [812] 05/23/07 13:13:23 thomase added horizontal ellipsis to the list of things that gets normalized, and …
@799 [799] 04/30/07 13:04:18 thomase Parsing tables by type and handling name variants.
@798 [798] 04/30/07 13:02:45 thomase boundary condition = strip
@796 [796] 04/25/07 17:55:57 thomase First steps in parsing the directory tables.
@795 [795] 04/25/07 17:29:30 thomase Identify the directory listing tables and organize them for further …
@794 [794] 04/25/07 16:46:15 thomase Add saving of biblio in mods format as part of the "cycle".
@793 [793] 04/25/07 16:45:43 thomase Better trapping and reporting of failure conditions when trying to match.
@792 [792] 04/25/07 16:44:58 thomase Handle the case of a directory listing that contains no abbreviation …
@791 [791] 04/25/07 13:36:49 thomase Added more checks to make title matching more robust, yet more flexible.
@790 [790] 04/25/07 13:36:26 thomase Fixed bad code that was borking html character entities for characters …
@789 [789] 04/25/07 13:35:25 thomase Fixed bad code that was borking html character entities for characters …
@788 [788] 04/24/07 16:47:30 thomase Take a modsCollection file produced through biblioextraction and enrich it …
@787 [787] 04/19/07 12:44:54 thomase Clean up namespace mess created by lxml etree.