pyaccuwage

Author	SHA1	Message	Date
Mark Riedesel	5f4dc8b80f	add 'blank' field option to allow empty text in required fields (default: false)	2024-03-31 11:14:16 -04:00
Mark Riedesel	74b7935ced	bump version to 2024	2024-03-29 10:50:25 -04:00
Mark Riedesel	66573e4d1d	update for 2023 p1220 parsing, stupid irs	2024-03-29 10:48:04 -04:00
Mark Riedesel	86f8861da1	encode record delimiter as ascii bytes when str is passed	2022-02-06 11:06:51 -06:00
Mark Riedesel	042de7ecb0	import typing.Callable (python 3.10+)	2021-12-18 08:56:43 -05:00
Mark Riedesel	f28cd6edf2	bump version 0.2020.0	2021-09-03 07:48:24 -05:00
Mark Riedesel	0bd82e09c4	Fix StaticField + tests for StaticField and unset optional TextField	2021-09-03 05:45:01 -05:00
Mark Riedesel	558e3fd232	hopefully fix StaticField	2021-09-02 17:40:35 -05:00
Mark Riedesel	7867a52a0c	flipped args around like a simpleton	2021-01-29 16:26:26 -05:00
Mark Riedesel	bfd43b7448	release 0.2018.2	2020-06-12 14:45:08 -05:00
Mark Riedesel	1f1d3dd9bb	Merge branch 'conversion-support'	2020-06-12 13:13:28 -05:00
Mark Riedesel	431b594c1e	add pyaccuwage-convert	2020-06-12 13:10:13 -05:00
Mark Riedesel	8f86f76167	add format interchange functions, add tests, fix stuff	2020-06-12 13:07:41 -05:00
Mark Riedesel	6af5067fca	add option for record delimiter	2019-01-30 14:25:24 -06:00
Mark Riedesel	250ca8d31f	fix flubbed blank field specifier on StateTotalRecordIA	2019-01-28 13:29:14 -06:00
Mark Riedesel	7ddcfcc1c3	clean up some indent	2019-01-27 10:36:37 -06:00
Mark Riedesel	d08f1ca586	hopefully fix python 2 and 3 compatibility	2019-01-27 09:30:22 -06:00
Mark Riedesel	6381f8b1ec	bump version to 2018.01	2019-01-26 16:11:24 -06:00
Mark Riedesel	7c32cb0dd3	add StateTotalRecord for Iowa	2019-01-26 11:49:31 -06:00
Mark Riedesel	5afdcd6a50	add 'permitted_benefits_health' to RT and RO records for 2017	2018-01-27 11:38:23 -06:00
Mark Riedesel	706c39f7bb	CRLFField return binary data for get_data()	2017-10-29 10:41:52 -05:00
Mark Riedesel	078273f49f	fix json encoding by encoding bytes as ascii	2017-01-07 17:00:29 -06:00
Mark Riedesel	9320c68961	use BytesIO to work with python3	2017-01-07 14:52:33 -06:00
Mark Riedesel	16bf2c41d0	run through 2to3	2017-01-07 13:58:33 -06:00
Binh Nguyen	961aedc0ae	Added very important data cleaning added TextField now cleans CR and LF from data, this is very important for not breaking everything and leaving me completely confused. Thank you, Lauren!	2014-02-01 15:10:40 -06:00
Binh Nguyen	fc04a66869	Fixed debugging output	2014-02-01 12:57:36 -06:00
Mark Riedesel	4eedab0e7c	Added default record length	2013-10-11 00:21:28 -05:00
Binh Nguyen	03ce460181	Completed JSON importer. Exported from import matches original data, must be working	2013-05-21 13:36:44 -05:00
Binh Nguyen	7f9e5dbf65	added json encoder and partially functioning json decoder	2013-05-14 13:48:48 -05:00
Binh Nguyen	b9982c3a21	added missing modeldef.py and fixed genfieldfill	2013-04-20 14:34:14 -05:00
Binh Nguyen	9bbe100929	added pyaccuwage-genfieldfill	2013-04-20 14:31:09 -05:00
Binh Nguyen	ef9f012bd2	added checkseq to scripts in setup.py	2013-04-20 13:03:09 -05:00
Binh Nguyen	6bff5da58b	pyaccuwage-checkseq now reports error lines when it encounters out-of-sequence field comments	2013-04-13 12:31:11 -05:00
Binh Nguyen	c6df6c5452	Added pyaccuwage-checkseq. Everything works so far, currently the sequence comments are returned as string tuples. Next step is to take these results, convert them to integers, and make sure they occur in the expected linear order.	2013-03-30 13:15:23 -05:00
Binh Nguyen	e8e57bb932	improved record detection, state records are now found	2013-03-26 13:23:48 -05:00
Binh Nguyen	8cf78b5336	removed blank field counter, replaced with hash digest of rowspan	2013-03-20 15:49:16 -05:00
Binh Nguyen	456c15eb1c	Merge branch 'master' of brimstone.klowner.com:pyaccuwage Conflicts: pyaccuwage/pdfextract.py	2013-03-20 15:19:31 -05:00
Binh Nguyen	47f5021a84	changing repr	2013-03-20 15:18:12 -05:00
Binh Nguyen	e0d54c8a01	merging	2013-03-20 15:15:51 -05:00
Binh Nguyen	d058e64d26	tweaking validation	2013-03-20 15:13:44 -05:00
Binh Nguyen	a1ab6b4918	Looks like 1220 form has changed since last year, work on getting changes applied in a simple manner.	2013-03-05 14:49:38 -06:00
Binh Nguyen	afc4138898	fixed automatic model generation inheritance	2013-02-19 16:06:11 -06:00
Binh Nguyen	b40e736ae0	bumping version, improving field type guessing	2013-02-19 15:55:05 -06:00
Binh Nguyen	730073dcd1	working better!	2013-02-05 15:43:04 -06:00
Binh Nguyen	e6e087ef38	Record merging seems to work now that header offsets have been corrected. There's an issue parsing p1220 on line 2570. Maybe making the parser ignore full-width lines during parsing would fix the problem, if there's some way to check the length of a row, only counting single-spaced words?	2013-01-29 15:48:32 -06:00
Binh Nguyen	6e4a975cfb	Changed the way records are found by searching for field headers and then working backwards to determine the record name. We also added the ability to "break" from reading a series of field definitions based on certain break points such as "Record Layout". There is currently an error in p1220 line 2704 which is caused by the column data starting on the 4th column "Description and Remarks". If ColumnCollectors started with the field titles, and had awareness of the column positions starting with those, it may be possible to at least read the following record fields without auto-adjusting them.	2012-12-04 16:04:08 -06:00
Binh Nguyen	8995f142e5	Merge branch 'master' of brimstone.klowner.com:pyaccuwage Conflicts: pyaccuwage/pdfextract.py	2012-12-04 14:57:20 -06:00
Binh Nguyen	6e1d02db8d	trying new header location method	2012-12-04 14:54:10 -06:00
Binh Nguyen	e9a6dc981f	Refer to previous log, but also verify that records are returning proper information prior to getting passed into the ColumnCollector. It seems like some things are getting stripped out due to blank lines or perhaps the annoying "Record Layout" pages. If we could extract the "record layout" sections, things may be simpler"	2012-11-27 16:01:00 -06:00
Binh Nguyen	31ff97db8a	Almost have things working. It seems like some of the record results are overlapping. I'm assuming this is due to missing a continue or something inside the ColumnCollector. I added a couple new IsNextRecord exceptions in response to blank rows, but this may be causing more problems than expected. Next step is probably to check the records returned, and verify that nothing is being duplicated. Some of the duplicates may be filtered out by the RecordBuilder class, or during the fields filtering in the pyaccuwage-pdfparse script (see: fields).	2012-11-20 16:05:36 -06:00

103 commits