09:37 |
Dyrcona |
The message normally looks like: Invalid indicators "00d" forced to blanks |
09:37 |
rfrasur |
remingtron: Deepfreeze killed most of the orcs in our computers. |
09:38 |
Dyrcona |
My guess is that the program spit out the orced to blank, because the invalid indicators were 3 DEL characters, so the following 3 characters from the message were deleted. |
09:38 |
Dyrcona |
DEL characters don't belong in a MARC record, but there they are. |
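The indicator problem described above can be sketched in a few lines of Python. This is only an illustration, not the check MARC::Record performs: the function name `clean_indicators` is hypothetical, and the rule that a valid indicator is a blank, a digit, or a lowercase ASCII letter is an assumption made for the example. Warnings use `repr()` so that a DEL character shows up as `'\x7f'` instead of eating part of the message on a terminal.

```python
import re

# Assumption for this sketch: a valid indicator is a blank, a digit,
# or a lowercase ASCII letter. Anything else (DEL, "|", ...) is invalid.
VALID_INDICATOR = re.compile(r"[ 0-9a-z]")


def clean_indicators(indicators):
    """Return (two-character indicator string, list of warnings).

    Invalid characters are replaced by blanks. Each warning shows the
    offending character with repr(), so control characters stay visible.
    """
    cleaned = []
    warnings = []
    # Pad or trim to exactly two positions before checking each one.
    for ch in indicators[:2].ljust(2):
        if VALID_INDICATOR.fullmatch(ch):
            cleaned.append(ch)
        else:
            warnings.append(f"invalid indicator {ch!r} forced to blank")
            cleaned.append(" ")
    return "".join(cleaned), warnings
```

Calling `clean_indicators("\x7f\x7f")` returns two blanks plus two warnings that name `'\x7f'` explicitly.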
09:38 |
rfrasur |
clever...well...not really, but clever that you figured it out. |
09:38 |
remingtron |
DEL characters are among the most mysterious |
09:38 |
Dyrcona |
@blame TLC |
09:38 |
pinesol_green |
Dyrcona: TLC is why we can never have nice things! |
09:38 |
rfrasur |
If there's a human inputting information, standards are suggestions. |
09:40 |
Dyrcona |
The software should never allow a DEL character to be inserted into a MARC record, and most GUI frameworks would simply delete something from input rather than propagate the DEL character to the data. |
09:40 |
Dyrcona |
I blame buggy software and therefore a programmer for this one. |
09:41 |
Dyrcona |
Dyrcona's corollary to the 2nd Law of Thermodynamics: Data in. Entropy out. |
09:42 |
rfrasur |
I dunno any libraries that use TLC. I know they exist. Just don't have personal experience with them. |
10:51 |
gmcharlt |
this cannot be left un-noted |
10:52 |
gmcharlt |
@quote add <senator> what a nice smile, dojo. why thank you, angularjs, and that's a nice outfit you're wearing. |
10:52 |
pinesol_green |
gmcharlt: The operation succeeded. Quote #68 added. |
10:53 |
Dyrcona |
I'm processing the final MARC data file for this weekend's migration into my development server for testing. |
10:53 |
jeff |
I found some interesting (and not so interesting) tidbits in this article, its comments on site and on hn: https://coderwall.com/p/3qclqg https://news.ycombinator.com/item?id=6452960 |
10:53 |
Dyrcona |
I think I'm getting a few more messages about using the object hash as a call number than I did with the previous sample files. |
10:54 |
Dyrcona |
That means more copies without call numbers. |
12:03 |
Dyrcona |
heh. |
12:04 |
Dyrcona |
jcamins: The code that I'm using right now doesn't really try validating. |
12:05 |
Dyrcona |
It basically says, "Oh, you have a subfield code that isn't a-z or 0-9. I'm deleting the subfield. Oh, you have a data field with no subfields. I'm deleting that, too." |
12:12 |
Dyrcona |
That's the generic bits anyway. There is an additional check specific to this particular set of MARC records. |
12:13 |
Dyrcona |
I'm using a wrapper around OpenILS::Utils::Normalize::clean_marc in my application. |
12:13 |
* Dyrcona |
daemonizes himself and goes back to watching messages scroll in a console window. |
12:30 |
* Dyrcona |
forks a foreground process. |
12:30 |
Dyrcona |
Heh. Now, it also deletes "empty" subfields. |
12:30 |
Dyrcona |
I should call it "Kiss your MARC goodbye." ;) |
12:44 |
jcamins |
Dyrcona: doesn't MARC::Record already handle that? |
12:44 |
jcamins |
(subfield-less fields and data-less subfields) |
12:45 |
Dyrcona |
jcamins: I don't think so. It only seems to complain about empty data fields, not subfields. |
12:45 |
Dyrcona |
I made a patch to it yesterday to match subfield codes by regexp. |
12:45 |
jcamins |
Ooh, that's useful! |
12:45 |
Dyrcona |
Also, I'm defining empty as /^\s*$/ |
12:46 |
Dyrcona |
So all spaces or actually empty, so I'm getting things that MARC::Record would miss anyway. |
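The cleanup rules described in this exchange can be sketched as follows. This is a Python illustration under stated assumptions, not the Perl wrapper discussed above: fields are modelled as plain `(tag, [(code, value), ...])` tuples, and `clean_fields` is a hypothetical name.

```python
import re

# "Empty" as defined in the conversation: nothing at all, or only whitespace.
EMPTY = re.compile(r"^\s*$")
# Subfield codes that survive the cleanup: a-z or 0-9.
VALID_CODE = re.compile(r"[a-z0-9]")


def clean_fields(fields):
    """Drop bad subfields, then drop data fields left with no subfields.

    `fields` is a list of (tag, subfields) tuples, where subfields is a
    list of (code, value) tuples. This layout is assumed for the sketch.
    """
    cleaned = []
    for tag, subfields in fields:
        kept = [
            (code, value)
            for code, value in subfields
            if VALID_CODE.fullmatch(code) and not EMPTY.match(value)
        ]
        if kept:  # a data field with no subfields is deleted
            cleaned.append((tag, kept))
    return cleaned
```

A 245 with one good subfield, one subfield coded `!`, and one all-blank subfield keeps only the good subfield; a 500 whose only subfield is empty disappears entirely.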
12:47 |
Dyrcona |
https://rt.cpan.org/Public/Bug/Display.html?id=88682 |
12:49 |
Dyrcona |
Line 458 could probably be improved. Maybe I'll submit an updated patch. |
12:49 |
jcamins |
Still looks useful. |
11:03 |
Dyrcona |
jeff Understood. I think the requirement is what is ridickerous. :) |
11:03 |
jeff |
aha! :-) |
11:03 |
jeff |
also, to avoid "we batch updated a bunch of copies into a new location that is not in-scope", the copy status table doesn't rely on acp.edit_time |
11:03 |
Dyrcona |
Anyway... I'll bounce back to work, and maybe start a git branch or repo for a new marc export tool. |
11:04 |
bshum |
Dyrcona++ |
11:04 |
|
zerick joined #evergreen |
11:04 |
dbs |
Dyrcona: yeah, I was thinking doing most of the export work in-database would probably be wayyyy faster |
14:42 |
jboyer-isl |
That's also nice, since more than one of the CPAN links on that module's page goes to a dead link. D: |
14:43 |
senator |
smallish to moderate amount of new EG code to take a stripe "token" (their term) and validate that with stripe and maybe then ultimately give the user final confirmation (depending on how that part works exactly; like i said, just reading it for the first time now but finding it very promising) |
14:44 |
jeff |
stripe's api and their api docs are worth review for anyone creating an api or docs. |
14:45 |
Dyrcona |
Cataloging at the circ. desk, and people wonder why MARC records are such shit. |
14:50 |
jboyer-isl |
Well, I'm really hoping to get a lot done at the hackaway re: OPAC mobile friendly-ness, but does anyone else have any interest in getting basic stripe support started if there's time/tuits? |
14:52 |
bshum |
Sigh |
14:53 |
bshum |
Sorry jboyer-isl, not you. That idea sounds worthy of being added to the list. |
09:08 |
Dyrcona |
i18n ain't easy. |
09:08 |
Dyrcona |
but it should be. |
09:08 |
paxed |
*insert meme* THE NUMBER OF I18N PATHS ... IS TOO DAMN HIGH |
09:10 |
Dyrcona |
My current joy is dealing with a collection of MARC records that are encoded in two different character sets seemingly at random. |
09:10 |
Dyrcona |
Oh, and all of the invalid MARC records in said collection. |
09:10 |
rfrasur |
yeah..."joy" seems about right. |
09:10 |
* Dyrcona |
mumbles something about proprietary ILS vendors. |
09:10 |
* rfrasur |
mumbles something about librarians in general. |
09:10 |
Dyrcona |
Invalid indicator "|" forced to blank should just never happen. |
09:11 |
|
sseng joined #evergreen |
09:11 |
rfrasur |
are you having to import MARC w/ |? |
09:12 |
Dyrcona |
I'm running a script I wrote on the binary MARC; MARC::Record and MARC::Charset take care of the rest. |
09:12 |
Dyrcona |
However, whenever MARC::Charset encounters a record that isn't MARC 8, I get a no mapping found message. |
09:12 |
Dyrcona |
The record still goes in. |
09:13 |
Dyrcona |
Just with the garbage character included. |
09:13 |
rfrasur |
yeah...I'd go with "joy" rather than the really long of explanatory descriptors. |
09:19 |
rfrasur |
You just answered the question. |
09:19 |
Dyrcona |
Yes, I'm calling them out on allowing a phrase to be entered as "indicators" on a record. |
09:19 |
rfrasur |
At least they recognize that that's not a real indicator? |
09:19 |
Dyrcona |
No, it is MARC::Record and friends that recognizes it is an invalid indicator. |
09:20 |
Dyrcona |
TLC gladly accepted and then exported that crap. |
09:20 |
rfrasur |
Was that actually entered as an indicator in the MARC? or was it a migratory mess? |
09:21 |
Dyrcona |
I'm assuming it had to be entered to appear in the export. However, I have no idea what kind of shit TLC gets up to under the hood. |
13:53 |
Dyrcona |
I've checked the fine rate and max fine and those look good. |
13:54 |
gmcharlt |
are the individual generate_fines calls showing up in the logs? |
13:55 |
Dyrcona |
gmcharlt: I didn't check the logs, but I will after I run it again this afternoon. |
13:55 |
Dyrcona |
Think I'll truncate the logs after my marc load finished. |
13:55 |
Dyrcona |
finishes. |
13:56 |
yboston |
heads up, the DIG monthly meeting will be starting at 2 PM EST |
13:57 |
* krvmga |
is here for the DIG meeting. |
13:43 |
paxed |
hm. how would i debug fieldmapper problem? seems like it's only giving me english strings, not finnish ones. |
13:44 |
paxed |
looks like http://localhost/reports/fm_IDL.xml?locale=en-US vs. locale=fi-FI work correctly. |
13:45 |
paxed |
staff client won't show fm strings in finnish though. :( |
13:47 |
Dyrcona |
gmcharlt: Are you also the maintainer of MARC::Batch? |
13:47 |
gmcharlt |
Dyrcona: yes |
13:47 |
Dyrcona |
I think I'll file an issue with a patch for a feature request in a bit. |
13:47 |
gmcharlt |
groovy |
13:56 |
gmcharlt |
Dyrcona: *quack* ;) |
13:57 |
* jeff |
grins |
13:58 |
rfrasur |
(how very strange that the ISBN for this book is only one digit off from another book that we have in our purchase queue) |
13:58 |
Dyrcona |
Although, it would be pretty easy to make MARC::Batch add the new line rather than my code setting warnings off and then querying each record. |
13:58 |
mrpeters |
tater: i think it actually works right out of the box! |
13:58 |
csharp |
hopkinsju: I'm using Fedora 19 as my primary OS at home and work right now - or were you talking about server? |
13:58 |
Dyrcona |
How very common that two totally different books from the same publisher have the same ISBN. |
14:42 |
rfrasur |
or blood soup |
14:42 |
Dyrcona |
Eggs pickled in urine. |
14:43 |
Dyrcona |
On that note, I go back to watching error messages stream over my screen. |
14:44 |
Dyrcona |
1,200 records to go and my marc load is done. |
14:46 |
rfrasur |
Dyrcona: You win with the urine pickling. |
14:46 |
* rfrasur |
still hates cottage cheese, but just a little less. |
14:49 |
mrpeters |
tater -- would you welcome a rewrite of http://esilibrary.com/~mtate/eg-stats.html with instructions for rsyslog (thanks to moodaepo) |
08:07 |
|
rsinger joined #evergreen |
08:52 |
|
Dyrcona joined #evergreen |
08:52 |
Dyrcona |
Here's a fun one! |
08:52 |
Dyrcona |
Using MARC::Charset from the repo on SourceForge on both machines. |
08:53 |
Dyrcona |
One machine processes the file of MARC records with no problems. |
08:53 |
Dyrcona |
The other blows up on record 19 067. |
08:53 |
Dyrcona |
The files are identical. |
08:54 |
Dyrcona |
no mapping found at position 17 in Phép lạ của sự tỉnh thức. at /usr/local/share/perl/5.14.2/MARC/Charset.pm line 296. |
08:54 |
Dyrcona |
It is the i with the whatever-that-is above it. |
08:55 |
Dyrcona |
Records are UTF-8 and say so in leader 09. |
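Whether a record "says" UTF-8 and whether it really is UTF-8 are two separate checks. A minimal Python sketch of both, assuming the usual MARC 21 convention that leader position 09 is `a` for UCS/Unicode and blank for MARC-8 (the function names are hypothetical):

```python
def declared_encoding(record):
    """Report the encoding a binary MARC record claims in leader/09."""
    leader = record[:24]
    if len(leader) < 24:
        raise ValueError("record is shorter than a 24-byte leader")
    flag = leader[9:10]
    if flag == b"a":
        return "utf-8"
    if flag == b" ":
        return "marc-8"
    return "unknown"


def is_really_utf8(data):
    """Return True when the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

A record where `declared_encoding` returns `"utf-8"` but `is_really_utf8` returns `False` is mislabelled. The reverse check proves less, since plain ASCII is valid in both encodings.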
08:57 |
Dyrcona |
Vietnamese, again. :( |
08:59 |
Dyrcona |
jcamins: Actually UTF-8 and say UTF-8, I converted them with yaz-marcdump. |
08:59 |
Dyrcona |
Maybe I should just try processing the original files. |
09:00 |
jcamins |
Dyrcona: okay, with MARC, you can never be too careful. |
09:00 |
Dyrcona |
jcamins: Right, MARC is a picky bastard. |
09:00 |
jcamins |
Same version of Perl on both machines? |
09:00 |
* Dyrcona |
kicks MARC where it hurts. |
09:00 |
Dyrcona |
jcamins: Yes, but I'll double check right now. |
09:01 |
Dyrcona |
Yep. 5.14.2 with 56 registered patches, installed from packages. |
09:02 |
jcamins |
Does one of the machines have EG, and one not? |
09:02 |
Dyrcona |
Nope, both have EG. One runs it as a server, the other uses the libs as a client. |
09:03 |
Dyrcona |
Also, to be clear, I'm using a patched version of MARC::Charset that has corrections Galen made based on these records. |
09:03 |
Dyrcona |
I wonder if I'm getting the right MARC::Charset on the one machine, though it prints the "correct" version if I ask it. |
09:04 |
jcamins |
Hm. That's where I was going... I was wondering if one had dependencies installed through CPAN and one via package manager. |
09:05 |
Dyrcona |
Well, both had MARC::Charset originally installed through package manager and then installed from source. |
09:05 |
Dyrcona |
However, one might have had a package update done via CPAN and the other not. |
09:05 |
Dyrcona |
Hmm.... |
09:07 |
Dyrcona |
I'm going to upgrade everything via CPAN on the one machine and see what happens. |
09:16 |
jcamins |
Heh. That'll take a while. |
09:25 |
Dyrcona |
Well, I'm certain that I did not upgrade everything via CPAN on the machine that works, but something related must have been updated there. |
09:25 |
jcamins |
Fixed it? |
09:27 |
Dyrcona |
Ah. I think on the machine where it works, I upgraded MARC::Charset via CPAN before installing it from source. |
09:27 |
Dyrcona |
Still updating everything... |
09:34 |
Dyrcona |
Well, the upgrade finished with some errors, but nothing that looks relevant to MARC::Charset. |
09:34 |
Dyrcona |
So, I'll know in a few hours. It takes a while to parse 19 000 MARC records and do some database lookups. |
09:35 |
jcamins |
You can't split out a file with just the problem record? |
09:35 |
Dyrcona |
jcamins: Yeah, I can, but I still need to process the whole file. |
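Splitting out a single problem record, as suggested above, can be sketched in Python. The sketch assumes well-formed binary MARC in which the first five bytes of each leader give the record length as ASCII digits; records with wrong lengths would need a different approach, such as splitting on the record terminator. The names `read_records` and `extract_record` are hypothetical.

```python
def read_records(stream):
    """Yield raw binary MARC records from a file opened in binary mode.

    Relies on the five-digit record length at the start of each leader.
    """
    while True:
        length_field = stream.read(5)
        if not length_field:
            return  # clean end of file
        if len(length_field) < 5 or not length_field.isdigit():
            raise ValueError(f"bad record length field: {length_field!r}")
        length = int(length_field)
        if length <= 5:
            raise ValueError(f"implausible record length: {length}")
        rest = stream.read(length - 5)
        if len(rest) != length - 5:
            raise ValueError("truncated record at end of file")
        yield length_field + rest


def extract_record(stream, number):
    """Return record `number` (1-based), e.g. 19067 for the one that fails."""
    for index, record in enumerate(read_records(stream), start=1):
        if index == number:
            return record
    raise IndexError(f"file has fewer than {number} records")
```

Writing the returned bytes to a new file gives a one-record test case without rerunning the whole load.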
09:35 |
Dyrcona |
;) |
09:54 |
|
mtcarlson_away joined #evergreen |
09:55 |
Dyrcona |
Acorn Squash, that's it. |
09:57 |
Dyrcona |
All right. Guess I'll cut the 240$a and 500$a from that record that are causing me problems. |
09:57 |
Dyrcona |
marc-- |
09:57 |
Dyrcona |
@whocares MARC |
09:57 |
pinesol_green |
bshum, jcamins and rfrasur hate MARC |
09:57 |
Dyrcona |
@hate MARC |
09:57 |
pinesol_green |
Dyrcona: The operation succeeded. Dyrcona hates MARC. |
09:58 |
Dyrcona |
lychee rice pudding sounds delicious. |
09:58 |
Dyrcona |
I'm coming over for some of that! ;) |
11:29 |
jcamins |
Oh, so you did. |
11:29 |
Dyrcona |
At 10:28:16. |
11:30 |
jcamins |
I was busy cutting lychees. |
11:32 |
Dyrcona |
I have installed MARC::Charset from here on both computers: git://git.code.sf.net/p/marcpm/code |
11:33 |
jcamins |
What happens if you open the file with :utf8 and pass MARC::Batch the file handle? |
11:34 |
jcamins |
I have trouble believing there would be a problem with the way MARC::Batch opens the file, but... |
11:35 |
Dyrcona |
Well, guess what.... |
11:36 |
jcamins |
Are you using screen in one? |
11:37 |
Dyrcona |
Screen in both. |
11:37 |
jcamins |
Could one have been started with -U and one without? |
11:37 |
Dyrcona |
MARC::Batch has no version, so I can't really check if they are the same. |
11:38 |
Dyrcona |
I never use -U. |
11:38 |
jcamins |
Huh. |
11:38 |
jcamins |
I always use -U. |
14:20 |
jcamins |
9400 links loaded. We're blazing along! |
15:21 |
Dyrcona |
Whee! Not making as much progress as I would have liked so far today. |
15:21 |
jcamins |
Dyrcona: I have loaded the first file incorrectly about six times now. |
15:22 |
Dyrcona |
I had hoped to be able to get to patron data today, but I'm still working on issues with MARC records. |
15:23 |
Dyrcona |
Yeah, I know how that goes.... |
15:23 |
Dyrcona |
Loading files incorrectly, almost never get it quite right the first time. |
15:24 |
jcamins |
Now, I don't actually *have* to load all... ummm... 3.8GB of data, but I want to see what will happen. |
14:18 |
rfrasur |
Not sure that THAT could have been improved since the format's dependent on real estate. |
14:18 |
rfrasur |
but the durability could have. |
14:19 |
|
mcooper joined #evergreen |
14:22 |
Dyrcona |
Is there an opensrf call to export marc records with holdings? I'm using introspect and either getting segfaults or not finding anything that looks relevant. |
14:25 |
Dyrcona |
ah, looks like supercat might have what I need. |
14:26 |
jeff_ |
likely, yes. |
14:26 |
jeff_ |
depends on what you're trying to do. :-) |
14:59 |
jeff |
jboyer-isl: thanks for the info. i won't feel bad about my lack of a player. :-) |
14:59 |
rfrasur |
hmm...s/humanity's/humanities |
14:59 |
jcamins |
Dyrcona: right, not right now, but at some point. |
15:00 |
Dyrcona |
jcamins: Anyway, I'm just answering the questionnaire that the library director can't, so looking through the code of marc export gives me what I need for now. |
15:00 |
jboyer-isl |
jeff: That said, they're very neat. I grabbed one with remote and one without off of ebay years ago. |
15:01 |
|
kmlussier joined #evergreen |
15:04 |
dbs |
Dyrcona: EBSCO was happy with marc_export's output for us, when we gave it a spin. |
14:53 |
jeff_ |
Dyrcona: ah. there's another place I'm wrong. :-) |
14:53 |
bshum |
berick: Do I poke you further on that one? :) |
14:53 |
Dyrcona |
I mean I don't think U+00E1 equals á |
14:54 |
Dyrcona |
jeff_: I'm guessing you have MARC-8 or ISO-8859-1 input here. |
14:56 |
Dyrcona |
U+00E1 = á |
14:57 |
berick |
bshum: hmm, the code all lives on github now (since that's where the original code lives) https://github.com/berick/openils-mapper/tree/GIR-segments-for-copy-data |
14:58 |
Dyrcona |
The proper entity is &amp;aacute; |
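The confusion over U+00E1 is easier to see with the bytes side by side. A small Python sketch (the function name `encodings_of` is hypothetical) shows that á is one byte in Latin-1 and two in UTF-8, and that it decomposes into a base letter plus a combining acute accent. The decomposed form matters because MARC-8 stores the accent as a separate combining byte ahead of the base letter rather than as one precomposed character.

```python
import unicodedata


def encodings_of(char):
    """Describe one character: code point, UTF-8 and Latin-1 bytes (hex),
    and its canonical decomposition into base letter plus combining marks."""
    decomposed = unicodedata.normalize("NFD", char)
    return {
        "code point": f"U+{ord(char):04X}",
        "utf-8": char.encode("utf-8").hex(),
        "latin-1": char.encode("latin-1").hex(),
        "decomposed": [f"U+{ord(c):04X}" for c in decomposed],
    }
```

For `"\u00e1"` this reports code point U+00E1, UTF-8 bytes `c3a1`, Latin-1 byte `e1`, and the decomposition U+0061 followed by U+0301.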
15:14 |
Dyrcona |
gmcharlt++ # for bearing with me in a moment of confusion on my part |
15:15 |
gmcharlt |
Dyrcona: my pleasure; character set issues have given me a ton of headaches over the years, and I'm happy to help NOT spread the pain around |
15:16 |
|
Callender joined #evergreen |
15:17 |
Dyrcona |
gmcharlt: Speaking of which, you said a while ago that you were going to release a new version of MARC::Charset. I see 1.34 is still the latest on CPAN. |
15:18 |
Dyrcona |
Those changes that I picked from sourceforge worked for me, except the Hangul records as you noted. |
15:18 |
* dbs |
was thinking about updating Fedora packages sometime in the next few weeks |
15:18 |
gmcharlt |
Dyrcona: yep, sudden loss of tuits for some Cyrillic corrections I'm working on -- release soonish, though |
15:18 |
Dyrcona |
gmcharlt++ |
15:19 |
dbs |
gmcharlt++ |
15:20 |
berick |
bshum: i don't recall offhand if any of the EDI configuration/install is documented. (it's been a while) |
15:20 |
Dyrcona |
I find it amusing that when I tell someone we're having charset problems with MARC, they always ask, "Is it Cyrillic?" |
15:20 |
Dyrcona |
I usually have to say, "No, it's Vietnamese." |
15:20 |
Dyrcona |
;) |
15:20 |
bshum |
berick: Yeah it's in the docs. I guess we need to adjust them slightly to point at the new github location. |
15:34 |
jeff |
more good news: specifying the encoding to new_from_xml also gives a heck of a speed boost, since it was transcoding to marc8 before. :P |
15:34 |
jeff |
i really thought that the defaults were properly set, and that i had verified that. |
15:41 |
jeff |
hah. went from about 600 records/sec to 20,000 records/sec |
15:42 |
Dyrcona |
jeff: Do you know how many times MARC::Record->new_from_xml() appears in the EG code? |
15:42 |
jeff |
no, but i'm about to find out, either from you or from ack. |
15:43 |
jeff |
new_from_xml appears 124 times |
15:43 |
jeff |
(in master) |
11:18 |
phasefx |
used to be autogen.sh would need to be run for stuff like that, but now I think there's perl code caching org unit data |
11:18 |
dbs |
Pretty much everything is "OPAC visible" in the staff client, no? |
11:18 |
phasefx |
so you may need to restart or reload apache, and/or restart OpenSRF services, for an OPAC visibility change to get noticed. I'm not sure |
11:19 |
Dyrcona |
gmcharlt: I have a list of name authorities that every 20th to 50th record throws up a MARC::Charset error. |
11:19 |
rfrasur |
lol, I don't have the ability to restart or reload apache or OpenSRF |
11:19 |
phasefx |
dbs: yeah, I think so |
11:19 |
Dyrcona |
gmcharlt: You want me to compress that file and just send it to you? |
11:26 |
|
acoomes joined #evergreen |
11:26 |
phasefx |
was a developer itch, but I don't think it was something PINES was pushing for, so it didn't land in jspac |
11:27 |
rfrasur |
personally, I don't have any desire to do a survey through the OPAC. I think it'd cheese some people off and confuse some others. But, if that's the case, and maybe it's already addressed in later versions, that needs to be removed from the UI |
11:28 |
Dyrcona |
gmcharlt: If you have a test version of MARC::Charset that you'd like me to try, just let me know. |
11:28 |
gmcharlt |
Dyrcona: soon |
11:29 |
Dyrcona |
cool |
11:29 |
phasefx |
rfrasur: I think the only reason for us to even want to implement over just using a dedicated javascript library is the js-lite mandate for the TPAC |
14:09 |
pastebot |
"Dyrcona" at 204.193.129.146 pasted "Random MARC::Charset Errors" (5 lines) at http://paste.evergreen-ils.org/12 |
14:16 |
Dyrcona |
I'm now trying 1.34 from CPAN to see if it fares better. |
14:22 |
Dyrcona |
There is no data, but bad data.... |
14:23 |
Dyrcona |
A bib record with a data field 009. 009 is a control field. Why that chokes MARC::Charset, I don't know, and why it survived my earlier runs through the file but dies now, I also don't know. |
14:24 |
|
tmccanna_ joined #evergreen |
14:26 |
rfrasur |
just talked w/ vendor interested in EG integration. interesting timing. |
14:28 |
rfrasur |
though I might have hurt his feelings when he said "we host Evergreen" and I said "lots of people do." |
08:56 |
csharp |
Dyrcona: thanks - that helps |
09:17 |
Dyrcona |
really? that helps? |
09:18 |
Dyrcona |
mcooper isn't back. |
09:18 |
Dyrcona |
I was gonna ask if his computer crashed trying to process a 4GB MARC with the command line xpath program. |
09:18 |
Dyrcona |
My guesstimate is that it would want 200GB of RAM. |
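The memory blow-up comes from building the whole document tree before evaluating any XPath. A streaming parse avoids that by handling one record at a time. The Python sketch below assumes the 4GB file is MARCXML in the standard `http://www.loc.gov/MARC21/slim` namespace; the function name `iter_titles` and the choice of 245$a are only for illustration.

```python
import xml.etree.ElementTree as ET

MARC_NS = "http://www.loc.gov/MARC21/slim"


def iter_titles(source):
    """Yield the 245$a of each record (or None) without loading the file.

    `source` is a filename or a binary file object containing MARCXML.
    """
    record_tag = f"{{{MARC_NS}}}record"
    title_path = (
        f"{{{MARC_NS}}}datafield[@tag='245']/"
        f"{{{MARC_NS}}}subfield[@code='a']"
    )
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != record_tag:
            continue
        node = elem.find(title_path)
        yield node.text if node is not None else None
        # Discard the record's children so memory use stays roughly flat.
        elem.clear()
```

The emptied record elements still hang off the root, so memory grows slowly rather than not at all. That is usually acceptable for a one-off pass over a large export.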
09:19 |
bshum |
Dyrcona: mcooper is in California, but maybe he's an early bird. |
09:23 |
csharp |
Dyrcona: it helps to know that the same behavior isn't universal |
09:50 |
Dyrcona |
we point to RT, 'cause nothing changes without a ticket. |
09:50 |
bshum |
Which happens to us all the time when the librarians don't always get the message from top down on circ policies. |
09:50 |
bshum |
So we get calls at different levels asking for different rules :( |
09:50 |
Dyrcona |
Ah well, back to perusing 9xx tags in a MARC file to see if I can figure out what they mean. |
09:50 |
csharp |
well tracking would've helped me a couple of years ago when I broke circ *during the day* and had to quickly fix it ;-) |
09:50 |
Dyrcona |
So, have a ticketing system, and say "No tickee, no changee." |
09:51 |
bshum |
Dyrcona: We do that too. And still nobody gets it. |
12:06 |
jeff_ |
and i think that solves the mystery. |
12:06 |
rfrasur |
jeff_++ |
12:11 |
|
jihpringle joined #evergreen |
12:14 |
Dyrcona |
@later tell mcooper Did your computer blow up using the xpath program on that 4GB file of MARC records? |
12:14 |
pinesol_green |
Dyrcona: The operation succeeded. |
12:31 |
|
mcooper joined #evergreen |
12:33 |
|
jdouma_ joined #evergreen |
09:32 |
tsbere |
csharp: Wonderful idea. But I am not running around like a nutcase to all the clients, nor am I going anywhere near the three machines I know of that are also talking to an exchange server. >_> |
09:34 |
csharp |
heh |
09:43 |
|
collum joined #evergreen |
10:00 |
Dyrcona |
gmcharlt: I'm getting "no mapping found for [0xFC] at position ..." for several records while trying to convert from MARC8 to UTF8 using MARC::Charset. Is that a known issue? |
10:02 |
Dyrcona |
Heh. Googling the other message that I get turns up results about black holes. |
10:02 |
Dyrcona |
http://lmgtfy.com/?q=seem+to+have+fallen+through+in+_process_escape() |
10:05 |
Dyrcona |
Google says it is. |
10:53 |
Dyrcona |
Ah, nice. |
11:04 |
Dyrcona |
My problem seems to be with MARC8 records that have latin 1 characters in them. |
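A quick way to test the Latin-1 theory is to list every high byte in a record and see what it would be if read as Latin-1. The Python sketch below is a rough diagnostic, not a real charset detector, and its function names are hypothetical. It cannot tell Latin-1 from MARC-8 on its own, but readable characters such as ü at the reported positions are a strong hint.

```python
def latin1_candidates(data):
    """List (offset, byte value, Latin-1 character) for each byte >= 0x80."""
    return [
        (offset, byte, bytes([byte]).decode("latin-1"))
        for offset, byte in enumerate(data)
        if byte >= 0x80
    ]


def guess_encoding(data):
    """Very rough classification of a byte string."""
    if all(byte < 0x80 for byte in data):
        return "ascii"
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8: either a legacy 8-bit set or MARC-8.
        return "latin-1 or marc-8"
    return "utf-8"
```

For the bytes `b"M\xfcller"`, `latin1_candidates` reports byte 0xFC at offset 1 as ü, which matches the "no mapping found for [0xFC]" symptom.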
11:06 |
dbs |
That's an oldie but a goodie |
11:08 |
Dyrcona |
Should I try yaz-iconv instead of MARC::Charset? |
11:08 |
|
rfrasur joined #evergreen |
11:09 |
dbs |
In the past, I've written my own find/replace subroutine for problems like that to run before the conversion :/ |
11:09 |
rfrasur |
(good morning all) |
11:09 |
Dyrcona |
I wonder if there are any MARC8 records, and if they're not all Latin 1. |
11:10 |
dbs |
see http://goo.gl/CMBjm for a horrible example. |
11:15 |
|
_zerick_ joined #evergreen |
11:16 |
Dyrcona |
So, looks like the MARC file is in Latin-1 in its entirety. |
11:16 |
|
zerick joined #evergreen |
11:18 |
|
dboyle joined #evergreen |
11:19 |
dbs |
yaz-marcdump to marctxt, then iconv it from Latin-1 to UTF8, and yaz-marcdump it back to MARC21 with ldr 9 shifted to 'a'? |
11:22 |
Dyrcona |
yaz-iconv -f iso8859-1 -t marc8 mvlcmarc.dat | perl ~/Src/perl/marccnv.pl > mvlcmarc.utf8 |
11:23 |
Dyrcona |
Since the authority records all came from the same vendor's ILS, I'm going to assume that they are in Latin-1 also. |
11:24 |
Dyrcona |
I'm working on bibs, now, and authorities later. |
11:24 |
Dyrcona |
Anyone seen this before? seem to have fallen through in _process_escape() at /usr/share/perl5/MARC/Charset.pm line 445. |
11:24 |
Dyrcona |
That's the one that turns up Google results for black holes. |
11:25 |
dbs |
Dyrcona: which version of MARC::Charset is that? |
11:25 |
|
acoomes joined #evergreen |
14:17 |
rfrasur |
bshum: yes |
14:19 |
rfrasur |
yay! subdomain! |
14:19 |
rfrasur |
(that was a little more enthusiastic than it probably had to be) |
14:20 |
* Dyrcona |
is in a maze of 19,679 MARC records, each one just like the last. |
14:21 |
rfrasur |
Dyrcona: that sounds like the makings of a Pink Floyd video |
14:23 |
bshum |
moodaepo: FYI, I just updated the wordpress install on production to the latest release. Along with plugin and theme updates. |
14:24 |
bshum |
Oh nice..... |
17:05 |
pastebot |
"hopkinsju" at 204.193.129.146 pasted "Ok, now this is weird..." (58 lines) at http://paste.evergreen-ils.org/31 |
17:07 |
gmcharlt |
Dyrcona: did you happen to make the binary MARC record available? I'm happy to add test cases for MARC::Charset |
17:08 |
Dyrcona |
gmcharlt: I didn't because I was able to get mostly what I needed from yaz-marcdump, plus I've been having fun with other things today. |
17:08 |
Dyrcona |
gmcharlt: I think this file is a mix of records in latin 1 and marc 8. |
17:08 |
gmcharlt |
I'd appreciate having it if you get a spare moment |
17:09 |
Dyrcona |
OK. |
17:09 |
gmcharlt |
gotta keep that bestiary stocked ;) |