Time |
Nick |
Message |
02:46 |
|
berick joined #evergreen |
04:38 |
|
jmurray-isl joined #evergreen |
07:34 |
|
BDorsey joined #evergreen |
07:41 |
|
collum joined #evergreen |
08:32 |
|
mmorgan joined #evergreen |
08:33 |
|
redavis joined #evergreen |
08:45 |
|
Dyrcona joined #evergreen |
08:52 |
|
kmlussier joined #evergreen |
08:54 |
Stompro |
Does the markmail site for the listservs work for anyone? http://georgialibraries.markmail.org/ I also cannot reach http://markmail.org/, connection refused error. |
08:56 |
kmlussier |
Nope. Now that you mention it, I noticed it a while back, but thought it was temporary. I forgot to check back later. |
08:57 |
kmlussier |
Good morning #evergreen! |
08:57 |
kmlussier |
@coffee [someone] |
08:57 |
* pinesol |
brews and pours a cup of Ethiopia Yirgacheffe, and sends it sliding down the bar to Stompro |
08:57 |
kmlussier |
@tea [someone] |
08:57 |
* pinesol |
brews and pours a pot of Earl Grey Decaffeinated Black Tea, and sends it sliding down the bar to troy (http://ratetea.com/tea/bigelow/earl-grey-decaf/87/) |
08:57 |
Stompro |
Hmm, markmail has been shut down. https://news.ycombinator.com/item?id=37230221 |
08:58 |
kmlussier |
85 days ago? Wow! It's a shame, I found it pretty useful. |
09:00 |
Stompro |
Yes, it was useful... and there are probably quite a few broken links to email discussion references now. |
09:01 |
kmlussier |
Oof |
09:02 |
kmlussier |
It would be nice to find an equivalent service where we could include the archives of more of our mailing lists. |
09:03 |
Dyrcona |
mailman has archives: http://list.evergreen-ils.org/pipermail/evergreen-general/ |
09:04 |
Dyrcona |
But yes, markmail was useful for a lot of mailing lists. |
09:07 |
kmlussier |
Can you search in mailman? |
09:10 |
Dyrcona |
Nope. Don't think so, unless there's a search extension for pipermail that we don't have/I'm not aware of. |
09:11 |
Dyrcona |
oops... Hit Ctrl-l in the wrong window. I just cleared my IRC scrollback. |
09:12 |
Dyrcona |
You can try using site: with DDG or Google, but it didn't work for me. I also may have been too specific with the site URL. |
09:12 |
|
adam_reid joined #evergreen |
09:14 |
Dyrcona |
Looks like FreeBSD also has been hit by the dissolution of MarkMail. Their mailing list FAQ hasn't been updated since February and it recommends using MarkMail. |
09:15 |
Dyrcona |
Search would be a good feature for pipermail. I wonder why no one has done it? |
09:35 |
|
dguarrac joined #evergreen |
09:54 |
|
kmlussier left #evergreen |
10:02 |
|
terranm joined #evergreen |
10:06 |
|
sandbergja joined #evergreen |
10:14 |
Dyrcona |
Waits with bated breath for a program to crash with an exception that he knows is coming. If it doesn't then something else is wrong... |
10:17 |
Dyrcona |
I suspect that my program is too naive and it is deadlocked on I/O. |
10:30 |
Dyrcona |
Ok. Seems like my problem is mixing java.nio.channels and InputStream. I'll give good ol' PipedInputStream a shot. It will likely synchronize better. |
10:42 |
Dyrcona |
Nope. Looks like it still hangs. |
10:54 |
|
sandbergja joined #evergreen |
10:59 |
Dyrcona |
Right. The Pipe is still not synchronized. It gets to the point where MarcXmlReader should read from the InputStream and hangs. |
11:01 |
Dyrcona |
So, I think I'll just dump the records with marc_export and write the Java program to read the file. That would be a lot simpler (and faster to implement) at this point. |
11:03 |
|
briank joined #evergreen |
11:18 |
Dyrcona |
Right. The reason I wanted to pull the records from the database and pipe them to MarcReader was to get the database ids of the bad records in the log output..... |
11:19 |
Dyrcona |
When the reader throws an exception, I can't get the 001 or 901$c out of the record. |
11:19 |
|
smayo joined #evergreen |
11:20 |
jeff |
why are you trying to parse a potentially problematic bunch of XML when the id column is right there? :-) |
11:21 |
jeff |
oh, never mind. I think that's exactly what you said you were doing, and I misread. |
11:24 |
Dyrcona |
Yeah. We're sending records to someone parsing them with MARC4J. I'm trying to implement a program to find records that MARC4J doesn't like and then output a spreadsheet of the errors for our catalogers. |
11:24 |
Dyrcona |
I may go back to java.nio.channels.Pipe. I had that sort of working, but when I added InputStream to the mix, the program would hang. |
11:26 |
Dyrcona |
I know I'm getting I/O deadlock, and I'm using classes that are recommended for use with different threads in a single thread. Maybe if I spin off a thread for the MARC reader, but then I need to also get the database Id in that thread somehow..... |
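[Editor's note] The single-thread hang described above is consistent with how java.io pipes behave: the pipe buffer is small, and a reader that wants end-of-stream will block until the write end is closed, so a flush alone may not be enough. The sketch below is not Dyrcona's program. It is a minimal illustration, with the hypothetical names `PipeDemo` and `roundTrip`, of moving the writer into its own thread while the calling thread reads.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class PipeDemo {
    /**
     * Sends payload through a pipe. A separate thread writes and then
     * closes the write end; the calling thread reads until end-of-stream.
     */
    public static String roundTrip(String payload)
            throws IOException, InterruptedException {
        PipedInputStream in = new PipedInputStream();
        PipedOutputStream out = new PipedOutputStream(in);
        byte[] bytes = payload.getBytes(StandardCharsets.UTF_8);

        Thread writer = new Thread(() -> {
            // try-with-resources closes the write end, which is what
            // lets the reader see end-of-stream instead of blocking.
            try (OutputStream o = out) {
                o.write(bytes);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
        writer.start();

        // The reader side stays in the calling thread, so any per-record
        // state (such as a database id) is still in scope here.
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (InputStream i = in) {
            byte[] buf = new byte[512];
            int n;
            while ((n = i.read(buf)) != -1) {
                sink.write(buf, 0, n);
            }
        }
        writer.join();
        return new String(sink.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        // Build a payload much larger than the default pipe buffer.
        StringBuilder sb = new StringBuilder();
        for (int k = 0; k < 100_000; k++) {
            sb.append('x');
        }
        System.out.println(roundTrip(sb.toString()).length()); // prints 100000
    }
}
```

Because only the writer moves to another thread, the reading code keeps access to whatever identifies the current record. Whether this matches the real program's structure is an assumption.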
11:26 |
Dyrcona |
There's too much computer science in Java. It's definitely not a hacker's language. |
11:27 |
|
kmlussier joined #evergreen |
11:28 |
Dyrcona |
I am flushing the OutputStream before trying to read from the other end of the pipe.... Pipes in C are so much easier. Well, maybe I'm more familiar with pipes in C. :) |
11:32 |
Dyrcona |
What happens if I don't flush it? I doubt that will improve things, but why not? |
11:41 |
|
jihpringle joined #evergreen |
11:46 |
Dyrcona |
MARC4J needs a way to construct a record from a blob. |
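[Editor's note] Since the log says MarcXmlReader reads from an InputStream, one way to approximate "a record from a blob" is to wrap each record's bytes in a ByteArrayInputStream and parse records one at a time, so the database id is known when a parse fails. The sketch below uses the JDK's own XML parser as a stand-in for MARC4J, so it only catches XML that is not well-formed, not MARC-level problems. `PerRecordCheck` and `findBadRecords` are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.helpers.DefaultHandler;

public class PerRecordCheck {
    /**
     * Parses each record separately and returns id -> error message for
     * the records that fail. The JDK XML parser stands in for MARC4J.
     */
    public static Map<Long, String> findBadRecords(Map<Long, String> recordsById)
            throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // DefaultHandler rethrows fatal errors instead of printing them.
        builder.setErrorHandler(new DefaultHandler());

        Map<Long, String> errors = new LinkedHashMap<>();
        for (Map.Entry<Long, String> entry : recordsById.entrySet()) {
            byte[] blob = entry.getValue().getBytes(StandardCharsets.UTF_8);
            try {
                builder.parse(new ByteArrayInputStream(blob));
            } catch (Exception ex) {
                // The id comes from the loop, not from the broken record.
                errors.put(entry.getKey(), String.valueOf(ex.getMessage()));
            }
        }
        return errors;
    }

    public static void main(String[] args) throws Exception {
        Map<Long, String> records = new LinkedHashMap<>();
        records.put(1L, "<record><leader>ok</leader></record>");
        records.put(2L, "<record><leader>"); // deliberately truncated
        System.out.println(findBadRecords(records).keySet()); // prints [2]
    }
}
```

The same loop shape should carry over to a reader that accepts an InputStream, but that substitution is untested here.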
12:09 |
Dyrcona |
Java is a lousy programming language/environment. You can't do anything useful without 12 layers of abstraction, threads, and a cherry on top. |
12:10 |
berick |
Dyrcona: watcha working on? |
12:11 |
Dyrcona |
I explained it earlier. Around 11:18 EST. |
12:12 |
berick |
hm, i meant more generally |
12:12 |
Dyrcona |
That's at 11:24 EST. :) |
12:13 |
Dyrcona |
I've already done this for a set of records that evergreen-universe-rs doesn't like. |
12:13 |
berick |
oh, gotcha |
12:15 |
Dyrcona |
I think I'll put this down for now and work on a program to convert some marcxml to binary marc to see if CPAN's RT will let me upload that. I get a 403 when I try to upload the marcxml examples from the Rust test. |
12:15 |
Dyrcona |
"That should only take half an hour," he said knowing it was very likely to be a lie. |
12:16 |
Dyrcona |
Also, mexican_coca-cola++ It tastes so much better with cane sugar than with HFCS. |
12:22 |
Dyrcona |
What? libmarc-perl does not install MARC::File and friends? I thought that it did. |
12:23 |
* Dyrcona |
grumbles about CPAN.....and that half hour will be spent just installing the tools. |
12:33 |
Dyrcona |
marcdump [options] file(s) That's useful.... |
12:34 |
Dyrcona |
And, I have to write my own. marcdump doesn't work on XML. |
12:35 |
Dyrcona |
I'm just full of complaints today, aren't I? |
12:44 |
Bmagic |
you? never! |
12:44 |
Dyrcona |
I'm installing MARC::File::XML with cpan set to local::lib, and there sure are a lot of prerequisites. |
12:50 |
Dyrcona |
Failed 11/11 test programs. 3/5 subtests failed. |
12:50 |
Dyrcona |
Right. I'll just run it on a server where this is already installed. |
12:51 |
Dyrcona |
And, I'll wipe out the stuff that CPAN installed locally. |
13:08 |
Dyrcona |
Looks like I may have to reboot. I just swapped monitors and the laptop doesn't see the new one. |
13:14 |
|
Dyrcona joined #evergreen |
13:16 |
Dyrcona |
hey! That's funny. MARC::Batch catches some of these errors: Leader must be 24 bytes long |
13:19 |
Stompro |
Dyrcona, have you looked at MARC::Lint already? |
13:33 |
Dyrcona |
Stompro: Never heard of it. |
13:33 |
Dyrcona |
Apparently, all of the software in the world has chosen this week to hate me: https://rt.cpan.org/Ticket/Display.html?id=150348&results=a5d68555ff4b4354e65ce6ec51f76634 # Read to the bottom... |
13:37 |
Dyrcona |
gmcharlt: RT on CPAN is apparently broken for uploads at the moment. I've tried 3 times to add a file of records to that ticket above. |
13:39 |
Dyrcona |
Stompro++ I'll give MARC::Lint a whirl. |
13:57 |
Dyrcona |
Stompro: It looks like MARC::Lint may help. I'm running a test program already. |
13:59 |
Stompro |
I wonder if it will be too verbose, or if you can pick out the bigger issues. I'm curious how it performs also? |
13:59 |
Dyrcona |
And, maybe not so much: is_valid_checksum: Didn't get object! at /usr/share/perl5/Business/ISBN.pm line 481, <DATA> line 244. |
13:59 |
Dyrcona |
Well, it gets totally clobbered by our data after bib id 233519. |
14:00 |
Dyrcona |
It seems to be fairly fast. I'm also feeding it via cursor fetching 100 rows at a time. |
14:02 |
Dyrcona |
So, if I wrap the check_record with an eval that might work. |
14:03 |
Stompro |
So it is trying to validate ISBNs also? Wonder if it will handle the B&T fake DVD ISBNs? |
14:05 |
Dyrcona |
Dunno. That could be what it exploded on. I'm trying again with an eval BLOCK. |
14:06 |
Dyrcona |
If it gets all the way through I'll use CSV, and output the warnings to a csv. I might output the errors to a separate one. |
14:08 |
Dyrcona |
My catalogers will be sorry that they ever asked for this. :) |
14:10 |
Dyrcona |
MARC::Lint seems to find something in nearly every record. |
14:39 |
|
terranm joined #evergreen |
14:54 |
Dyrcona |
hmm... What's the limit of rows in Excel, 32,000? I may have to split this up. |
14:55 |
|
kmlussier1 joined #evergreen |
14:56 |
Dyrcona |
I guess that has changed since the 16 bit days: 1,048,576 rows by 16,384 columns |
14:58 |
Dyrcona |
Google Sheets is 10 million cells, roughly speaking. Maximum column is zzz (or 18,278). I only have 2 columns. |
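[Editor's note] The column figure can be checked by treating the label as a bijective base-26 number: 26 + 26^2 + 26^3 = 18,278. A small sketch, using the hypothetical names `ColumnIndex` and `toIndex`:

```java
import java.util.Locale;

public class ColumnIndex {
    /** Converts a spreadsheet column label (A, Z, AA, ..., ZZZ) to a 1-based index. */
    public static int toIndex(String label) {
        int index = 0;
        for (char c : label.toUpperCase(Locale.ROOT).toCharArray()) {
            if (c < 'A' || c > 'Z') {
                throw new IllegalArgumentException("Not a column label: " + label);
            }
            // Each letter is a digit from 1 to 26; there is no zero digit.
            index = index * 26 + (c - 'A' + 1);
        }
        return index;
    }

    public static void main(String[] args) {
        System.out.println(toIndex("zzz")); // prints 18278
    }
}
```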
15:00 |
Dyrcona |
LibreOffice Calc has a limit of 1,024 columns per row. Maximum number of rows is just over 1 billion. I hope I don't have to worry about the 32,767 characters per cell limit. |
15:01 |
Dyrcona |
So as long as we stick with Google Sheets and LibreOffice, we should be OK. I doubt that I'll hit 10 million cells. It may go over 1 million rows. |
15:01 |
Dyrcona |
Right, enough blather. I'll test the version that writes the CSV files. |
15:03 |
* Dyrcona |
tries hot swapping monitors again. If I disappear, I had to reboot. |
15:10 |
Dyrcona |
Well, looks like I have to reboot. |
15:16 |
|
Dyrcona joined #evergreen |
15:28 |
Dyrcona |
Looks like MARC::Lint uses its own eval, and the errors that it passes up to the client program are not very useful for a cataloger: Can't locate object method "checksum" via package "0316110620" (perhaps you forgot to load "0316110620"?) at /usr/share/perl5/Business/ISBN.pm line 484, <DATA> line 244. |
15:32 |
Dyrcona |
Stompro: Do you know about Tk::MARC::Editor and MARC::Errorchecks? |
15:41 |
|
dluch joined #evergreen |
15:45 |
|
jihpringle joined #evergreen |
15:49 |
Stompro |
Dyrcona, nope, I haven't looked at them before. |
15:50 |
Dyrcona |
I had a quick look at MARC::Errorchecks and it seems more cumbersome and nitpicky than MARC::Lint. |
16:06 |
|
pinesol joined #evergreen |
17:08 |
|
mmorgan left #evergreen |
18:26 |
|
briank joined #evergreen |
18:28 |
|
sandbergja joined #evergreen |
18:31 |
|
kmlussier1 left #evergreen |
18:54 |
|
Rogan joined #evergreen |
23:17 |
|
Rogan joined #evergreen |