| 11:19 |
lualaba |
how can there be bad records? also i tried to download from here http://wiki.evergreen-ils.org/doku.php?id=evergreen-admin:importing:bibrecords |
| 11:20 |
Dyrcona |
lualaba: You did follow the suggests of the bold text in read at the top of the second URL, right? |
| 11:20 |
* Dyrcona |
cannot type today. |
| 11:21 |
Dyrcona |
lualaba: Also, the Gutenberg records are already binary MARC, you don't have to do anything to convert them, except bunzip them. |
| 11:21 |
lualaba |
note: is that an older instruction? |
| 11:22 |
lualaba |
and how do i import binary MARC records into the db? |
| 11:23 |
Dyrcona |
lualaba: Try Vandelay. I was able to import the Gutenberg records through Vandelay last time I tried in 2012. |
| 11:24 |
Dyrcona |
The client timed out, but the import eventually finished. |
| 11:24 |
lualaba |
i use version 2.8.1 |
| 11:25 |
Dyrcona |
Vandelay is the "MARC Batch Import/Export" option on the cataloging menu. |
| 11:25 |
Dyrcona |
The documentation should tell you all you need to know. |
| 11:25 |
lualaba |
i know, but is there any limit, or does it need time? |
| 11:26 |
lualaba |
after 1000 records i don't see any progress |
| 08:13 |
jboyer-isl |
I'd @quote that if I could. |
| 08:39 |
|
rlefaive joined #evergreen |
| 08:47 |
|
Dyrcona joined #evergreen |
| 08:53 |
Dyrcona |
Well, that's nice: Argument "The" isn't numeric in integer division (/) at /usr/share/perl5/MARC/Record.pm line 407. |
| 08:54 |
Dyrcona |
That's not from MARC export, so I guess I'll need to trap that and see what record produced it. |
| 08:54 |
|
maryj joined #evergreen |
| 08:57 |
Dyrcona |
Hmm. Might be from marc_export after all.... |
| 09:05 |
Dyrcona |
So, coming from an insert_grouped_field call in marc_export.... |
| 09:06 |
Dyrcona |
Ah, when adding items on line 473. |
| 09:06 |
Dyrcona |
The record must have a bad field. |
| 09:16 |
Dyrcona |
Warning from bibliographic record 1635630: Argument "The" isn't numeric in integer division (/) at /usr/share/perl5/MARC/Record.pm line 407. |
| 09:16 |
Dyrcona |
is a lot more useful. :) |
| 09:17 |
jboyer-isl |
What is it doing that there would be any math done at all, never mind math done on fields that haven't been checked for numeric-ness? |
| 09:18 |
jboyer-isl |
(I suppose I could look that up, what with the line numbers right there.) |
| 09:18 |
Dyrcona |
Line 407 of MARC::Record is in the insert_grouped_fields method. |
| 09:18 |
Dyrcona |
It is doing the math to determine where the inserted field(s) belong(s). |
| 09:19 |
Dyrcona |
That record has a summary field (should probably be a 520?) with a tag of 'The'. |
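The warning makes sense once you see the arithmetic. A sketch of the kind of grouping math MARC::Record's insert_grouped_field does (illustrative only, not the module's actual code): fields are grouped by the hundreds digit of the tag, so a non-numeric tag like 'The' breaks the division.

```python
# Sketch of the grouping arithmetic behind insert_grouped_field():
# fields are grouped by the hundreds digit of their tag (5xx, 6xx, ...).
# This function is illustrative, not MARC::Record's actual code.
def tag_group(tag):
    """Return the 'hundreds' group of a MARC tag, e.g. '520' -> 5."""
    return int(tag) // 100  # blows up for a tag like 'The'

print(tag_group("520"))  # -> 5

try:
    tag_group("The")
except ValueError as e:
    print("bad tag:", e)  # Perl instead warns and carries on
```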
| 09:20 |
Dyrcona |
@marc 520 |
| 09:20 |
pinesol_green |
Dyrcona: Unformatted information that describes the scope and general contents of the materials. This could be a summary, abstract, annotation, review, or only a phrase describing the material. (Repeatable) [a,b,u,3,6,8] |
| 09:20 |
Dyrcona |
Yep, that looks to me like what it ought to be, but I'll let the catalogers determine that. |
| 09:21 |
jeff |
@marc The |
| 09:23 |
Dyrcona |
"Wild Snow Sprout," eh.... |
| 09:23 |
* Dyrcona |
looks at the rain out the window. |
| 09:25 |
Dyrcona |
And RT ticket 5144 created.... |
| 09:25 |
Dyrcona |
Hmm. I made a branch to make that change. Maybe I should trap warnings around all calls to MARC::Record in marc_export and then make a LP bug? |
| 09:29 |
Dyrcona |
Oh, I see what happened.... |
| 09:29 |
Dyrcona |
The tag is The |
| 09:29 |
Dyrcona |
ind1 is A and ind2 is d |
| 09:32 |
Dyrcona |
It's what we call a "brief" record. It will get overlaid from OCLC eventually. |
| 09:32 |
|
mrpeters joined #evergreen |
| 09:32 |
Dyrcona |
It has the local 590. |
| 09:33 |
Dyrcona |
@marc 550 |
| 09:33 |
pinesol_green |
Dyrcona: Information about the current and former issuing bodies of a continuing resource. (Repeatable) [a,6,8] |
| 09:33 |
Dyrcona |
@marc 650 |
| 09:33 |
pinesol_green |
Dyrcona: A subject added entry in which the entry element is a topical term. (Repeatable) [a,b,c,d,e,v,x,y,z,2,3,4,6,8] |
| 09:33 |
Dyrcona |
@marc 500 |
| 09:33 |
pinesol_green |
Dyrcona: General information for which a specialized 5XX note field has not been defined. (Repeatable) [a,3,5,6,8] |
| 09:34 |
* Dyrcona |
is trying to remember what field the titles of a compilation go into. |
| 09:34 |
Dyrcona |
That's the field this should be. |
| 09:39 |
|
maryj_ joined #evergreen |
| 09:40 |
Dyrcona |
csharp++ |
| 09:40 |
Dyrcona |
heh. |
| 09:40 |
Dyrcona |
@blame [marc The] |
| 09:40 |
pinesol_green |
Dyrcona: unknown tag The is why we can never have nice things! |
| 09:40 |
tsbere |
@blame [quote random] |
| 09:40 |
pinesol_green |
tsbere: Quote #62: "< Dyrcona> À propos a migration from TLC: If you have a column called TOTALINHOUSEUSES you should also have TOTALOUTHOUSEUSES must eat cottage cheese! for symmetry's sake." (added by csharp at 11:49 AM, July 22, 2013) |
| 12:40 |
Dyrcona |
I imagine the author pwns one of these: https://plus.google.com/+ReverendEricHa/posts/Qn4aTEytdqn?pid=6231152009976367506&oid=103046039519355433778 |
| 12:41 |
jeff |
csharp: actually, you'll want to add a criteria to attempt to avoid invalid xml. |
| 12:44 |
jeff |
SELECT id FROM biblio.record_entry WHERE xml_is_well_formed(marc) AND xpath_exists('//marc:record/marc:datafield[@tag="505"]/marc:subfield[@code="t"]', marc::XML, ARRAY[ARRAY['marc', 'http://www.loc.gov/MARC21/slim']]); |
| 12:44 |
Dyrcona |
jeff: have you seen much invalid xml in your marc records? |
| 12:44 |
jeff |
found at least one just now. |
| 12:46 |
Dyrcona |
I'm running select id from biblio.record_entry where not xml_is_well_formed(marc) on my development database right now to see what I find. |
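The same well-formedness triage can be done outside the database. A rough Python analogue of Postgres's xml_is_well_formed(), useful for checking exported MARCXML before loading (a sketch, using only the standard library):

```python
import xml.etree.ElementTree as ET

def xml_is_well_formed(s):
    """Rough Python analogue of Postgres's xml_is_well_formed()."""
    try:
        ET.fromstring(s)
        return True
    except ET.ParseError:
        return False

print(xml_is_well_formed("<record><leader>x</leader></record>"))  # -> True
print(xml_is_well_formed("<record><leader>x</record>"))           # -> False
```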
| 12:46 |
phasefx |
hrmm, there should be a a_marcxml_is_well_formed trigger on bre |
| 12:47 |
jeff |
immediate 500 error on supercat marcxml retrieval, mods takes a bit to return an empty collection, standard catalog page returns quickly (but mostly broken), and the MARC Record view in the catalog seems to take a while too. |
| 12:48 |
jeff |
delays might be unrelated, but i wonder if something gets... stuck. |
| 15:11 |
* tsbere |
hasn't taken a close look though |
| 15:11 |
jboyer-isl |
kitteh_: Dyrcona's right, we're looking at bringing a Polaris system into our existing system. Just looking at the marc there are bizarre holdings fields that may or may not require copious coding to codify. (i.e. it stinks.) |
| 15:11 |
tsbere |
jboyer-isl: Is it at least valid MARC? |
| 15:12 |
Dyrcona |
jboyer-isl: Very often, you can configure the MARC holdings fields before export, but 1) I don't know if Polaris can do that and 2) if it can, I don't know if the people giving you the data know how to do that. |
| 15:12 |
jboyer-isl |
tsbere: yes, but there are multiple holdings tags with varying formats. I'm not sure how that was done, but un-doing it is going to be most of the work. |
| 15:13 |
Dyrcona |
jboyer-isl: That smells of the remnants of previous migrations. |
| 15:13 |
jboyer-isl |
Dyrcona: That's what I'm hoping. (Well, what I was really hoping for was someone to say "Oh, you probably need to have them BLAH" where BLAH isn't 3 dozen config changes) |
| 10:44 |
kmlussier |
gmcharlt: Sounds good to me. Thanks! |
| 10:44 |
kmlussier |
I will do so for two bugs now while I'm thinking of it. |
| 10:45 |
Dyrcona |
And, grr. That string_agg trick in the having is just not working.... Time to try something else. |
| 10:46 |
* Dyrcona |
is trying to find all copies with a certain marc type at a library. |
| 11:03 |
jboyer-isl |
Dyrcona: care to paste the query? |
| 11:03 |
Dyrcona |
jboyer-isl: I will paste it, but got it working with a second join. |
| 11:04 |
pastebot |
"Dyrcona" at 64.57.241.14 pasted "Copies by item type by lib" (15 lines) at http://paste.evergreen-ils.org/18 |
| 16:32 |
Dyrcona |
;) |
| 16:33 |
jeff |
<controlfield tag="007">sz zunznnnzneu </controlfield> |
| 16:33 |
jeff |
interesting. |
| 16:33 |
Dyrcona |
Well, it's marc and those spaces may have meaning. |
| 16:33 |
Dyrcona |
@marc 007 |
| 16:33 |
pinesol_green |
Dyrcona: This field contains special information about the physical characteristics in a coded form. The information may represent the whole item or parts of an item such as accompanying material. (Repeatable) [] |
| 16:33 |
jeff |
longest after trimming trailing whitespace is: <controlfield tag="007">snbdncrndbnesnfmngenhnninnjmnkpnlunmn</controlfield> |
| 16:34 |
Dyrcona |
That just looks like someone passed out on the keyboard. That can't possibly mean anything. :) |
| 16:34 |
Dyrcona |
But, it's MARC, so it probably does mean something. |
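It does: 007 is positional coded data, so even the spaces carry meaning. A partial decoder sketch for position 00 (category of material) — the code table here is a from-memory subset of the MARC 21 spec, so verify it against the LC documentation before relying on it:

```python
# Partial decoder for MARC 007/00 (category of material). The code
# table is a from-memory subset of MARC 21 -- check the LC docs.
CATEGORY = {
    "a": "map",
    "c": "electronic resource",
    "s": "sound recording",
    "t": "text",
    "v": "videorecording",
}

def decode_007(value):
    # Trailing blanks in an 007 are positional fill characters,
    # which is why "sz zunznnnzneu " keeps its spaces.
    return CATEGORY.get(value[:1], "unknown/other")

print(decode_007("sz zunznnnzneu "))  # -> sound recording
```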
| 16:36 |
jeff |
i actually found a video of the moment that 007 was placed on file: http://i.imgur.com/0RS6ND5.gif |
| 16:38 |
Dyrcona |
jeff++ |
| 16:38 |
Dyrcona |
heh |
| 10:55 |
Dyrcona |
Anyhoo. |
| 10:56 |
Dyrcona |
I don't think I've seen any records like that for a long time. |
| 10:59 |
jboyer-isl |
What appears to have happened: Saving that record stripped out a bunch of "empty" fields and subfields. (like so: <datafield tag="100" ind1="" ind2=""><subfield code="a"/></datafield> ) and that is the only real change I can see. (I don't suspect any unusual characters or whitespace issues, it was created in the client "by hand" as far as I can tell.) |
| 11:00 |
Dyrcona |
Ah. We sometimes get errors from MARC export about records with empty subfields. |
| 11:00 |
Dyrcona |
That may be a bug of creating records from templates. |
| 11:00 |
Dyrcona |
I've never looked into it, though. |
| 11:01 |
jboyer-isl |
Which is likely long fixed, because we'd be hitting this constantly if it was still happening. |
| 11:01 |
Dyrcona |
Well, *I* still see it, and it is always on newly-created records before central site catalogers overlay the records from OCLC. |
| 11:03 |
jboyer-isl |
(I have at times wished catalogers could save records with lots of empty fields; that way they could create new templates in the editor and all I'd have to do is select the data out of them. Time for an LP about in-db templates, perhaps.) |
| 11:03 |
jboyer-isl |
Dyrcona: newly created as in newly imported, or are staff able to save records with blank fields? |
| 11:04 |
Dyrcona |
Created from a MARC template, so created and saved in the client. |
| 11:04 |
Dyrcona |
I'm searching my trash folder to see if it has turned up in any recent dumps for EBSCO. |
| 11:04 |
Dyrcona |
I get something like this: Field 520 must have at least one subfield at /usr/share/perl5/MARC/File/XML.pm line 481. |
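The cleanup being described can be sketched in a few lines: drop subfields with no text, then drop any datafield left with no subfields. This is a guess at the behavior, not Evergreen's actual code, and it ignores MARCXML namespaces for brevity:

```python
import xml.etree.ElementTree as ET

def strip_empty_fields(xml):
    """Drop subfields with no text, then datafields left with no
    subfields. A guess at the cleanup described above (namespace-free
    for brevity); not Evergreen's actual code."""
    rec = ET.fromstring(xml)
    for df in list(rec.findall("datafield")):
        for sf in list(df.findall("subfield")):
            if not (sf.text or "").strip():
                df.remove(sf)
        if not df.findall("subfield"):
            rec.remove(df)
    return ET.tostring(rec, encoding="unicode")

dirty = ('<record><datafield tag="100" ind1=" " ind2=" ">'
         '<subfield code="a"/></datafield></record>')
print(strip_empty_fields(dirty))  # the empty 100 field is removed
```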
| 11:06 |
Dyrcona |
I should merge cf4410e & a9d80dc into my production branch. |
| 11:07 |
pinesol_green |
[evergreen|Jason Stephenson] LP 1502152: Improve marc_export warnings. - <http://git.evergreen-ils.org/?p=Evergreen.git;a=commit;h=cf4410e> |
| 11:07 |
pinesol_green |
[evergreen|Galen Charlton] LP#1502152: (follow-up) fix a typo - <http://git.evergreen-ils.org/?p=Evergreen.git;a=commit;h=a9d80dc> |
| 11:27 |
Dyrcona |
jeff: (add-hook 'sql-mode-hook (lambda () (setq indent-tabs-mode nil tab-width 4))) works pretty well for me. :) |
| 11:28 |
|
stonerl left #evergreen |
| 11:33 |
Dyrcona |
Hrm. Perhaps I misspoke about the 520. |
| 11:33 |
Dyrcona |
@marc 520 |
| 11:33 |
pinesol_green |
Dyrcona: Unformatted information that describes the scope and general contents of the materials. This could be a summary, abstract, annotation, review, or only a phrase describing the material. (Repeatable) [a,b,u,3,6,8] |
| 11:34 |
Dyrcona |
Yep. I did. |
| 11:34 |
Dyrcona |
@marc 540 |
| 11:34 |
pinesol_green |
Dyrcona: Terms governing the use of the described materials (e.g., copyrights, film rights, trade rights) after access has been provided. (Repeatable) [a,b,c,d,u,3,5,6,8] |
| 11:35 |
Dyrcona |
@marc 590 |
| 11:35 |
pinesol_green |
Dyrcona: unknown tag 590 |
| 11:37 |
Dyrcona |
Well, either I've guessed the wrong record, or I don't know what happened to it. |
| 11:47 |
Dyrcona |
Ah ha! |
| 11:47 |
Dyrcona |
Failure to map a character to MARC-8 in the 520 resulted in the record being skipped and erroring out. |
| 11:48 |
Dyrcona |
Looks like there's a UTF-8 em dash in the 520. |
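A crude pre-flight check can catch these before MARC::Charset does: flag characters outside printable ASCII as candidates for MARC-8 mapping failures. Real MARC-8 covers more than ASCII (ANSEL diacritics and some symbols), so this over-reports; it is a triage tool, not a MARC::Charset replacement:

```python
# Crude pre-flight check before handing UTF-8 records to a vendor that
# wants MARC-8. Over-reports (MARC-8 covers more than ASCII); triage only.
def suspect_chars(text):
    return [(i, ch) for i, ch in enumerate(text)
            if not (ch == "\n" or 0x20 <= ord(ch) <= 0x7E)]

summary = "A novel of the sea\u2014and of the men who sail it."
print(suspect_chars(summary))  # flags the em dash at position 18
```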
| 11:48 |
* Dyrcona |
wishes vendors would just accept UTF-8 MARC records already. |
| 11:52 |
|
jwoodard2 joined #evergreen |
| 12:14 |
|
Christineb joined #evergreen |
| 12:21 |
|
jihpringle joined #evergreen |
| 12:17 |
Bmagic |
on the 035 |
| 12:17 |
bshum |
@marc 035 |
| 12:17 |
pinesol_green |
bshum: A control number of a system other than the one whose control number is contained in field 001 (Control Number), field 010 (Library of Congress Control Number) or field 016 (National Bibliographic Agency Control Number). (Repeatable) [a,z,6,8] |
| 12:17 |
Dyrcona |
@marc 040 |
| 12:17 |
pinesol_green |
Dyrcona: The MARC code for or the name of the organization(s) that created the original bibliographic record, assigned MARC content designation and transcribed the record into machine-readable form, or modified (except for the addition of holdings symbols) an existing MARC record. These data and the code in 008/39 (Cataloging source) specify the parties responsible for the bibliographic record. (1 more message) |
| 12:18 |
Bmagic |
if you are hand crafting a bib, and you put in a 001, the software will move that to 035 right? |
| 12:18 |
bshum |
Bmagic: It depends on how you use Evergreen. |
| 12:47 |
Dyrcona |
pdot2: To find items on library shelves. |
| 12:48 |
pdot2 |
ah, okay |
| 12:50 |
Dyrcona |
In Evergreen, asset.call_number serves as the link between biblio.record_entry and asset.copy. |
| 12:50 |
Dyrcona |
biblio.record_entry being the bibliographic (MARC) data about the book/item/what have you. |
| 12:51 |
Dyrcona |
and asset.copy being the physical manifestation in the form of a book, dvd, what have you. |
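A toy in-memory model of the linkage just described: a copy points at a call number, which points at a bib record. Column names follow the stock Evergreen schema (asset.call_number.record, asset.copy.call_number); the data is made up:

```python
# Toy model of biblio.record_entry <- asset.call_number <- asset.copy.
records = {1: {"id": 1, "marc": "<record>...</record>"}}             # biblio.record_entry
call_numbers = {10: {"id": 10, "record": 1, "label": "FIC TOLKIEN"}} # asset.call_number
copies = {100: {"id": 100, "call_number": 10, "barcode": "30001"}}   # asset.copy

def bib_for_copy(copy_id):
    """Walk copy -> call_number -> record, as the SQL join would."""
    cn = call_numbers[copies[copy_id]["call_number"]]
    return records[cn["record"]]

print(bib_for_copy(100)["id"])  # -> 1
```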
| 12:52 |
* bshum |
likes diagrams: https://docs.google.com/drawings/d/1EIcumPTGAwLgJvC9FgnYB9LjyWEKFn4rw-DW8ctO69k/edit?usp=sharing |
| 12:52 |
* jeffdavis |
wonders idly what Melvil Dewey would say to that explanation of what call numbers are for |
| 14:47 |
RoganH |
Yeah, I'm trying to see what I can work out when many of us have separate accounts. I'm starting to hit the point of saying they'll either have to be done by hand or we'll need to do development. |
| 14:47 |
kmlussier |
I think one consortium may be loading them with the MARC stream importer, but I'm not sure. I know they generally do ebook titles that way. |
| 14:48 |
RoganH |
I can think of ways of doing it on the backend but part of the goal is to put it in the catalogers' hands. |
| 14:48 |
Dyrcona |
I could be wrong, but I don't think the marc stream importer can do anything that vandelay doesn't. |
| 14:48 |
kmlussier |
Dyrcona: No, it can't. |
| 14:49 |
pdot2 |
so, MARC records appear to be overkill for the few hundred books I'm looking at. I'm assuming I can get by with brief records for catalogs / circulation? |
| 14:49 |
RoganH |
pdot2: if you have a small collection and you need marc you can use something like the amazon 2 marc tool |
| 14:50 |
RoganH |
pdot2: http://chopac.org/cgi-bin/tools/az2marc.pl |
| 14:50 |
kmlussier |
pdot2: Brief MARC records? Sure. But you might want to make sure all of the fixed field information is there so that format icons and filters work properly. |
| 14:50 |
kmlussier |
+1 to the amazon 2 marc tool |
| 14:51 |
Dyrcona |
pdot2: You could also try grabbing records via z39.50, but +1 to amazon 2 marc. |
| 14:51 |
* Dyrcona |
goes back to fixing last night's mess. |
| 14:57 |
pdot2 |
amazon marc tool is pretty cool |
| 14:59 |
|
kitteh__ joined #evergreen |
| 15:59 |
pdot2 |
and restarted my client |
| 16:00 |
jeff |
you may have made changes that don't make sense, or you might be trying to assign your user to an org unit type that doesn't have "can_have_users" set to true. |
| 16:01 |
* Dyrcona |
wishes he had this feature right about now: http://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=11876 |
| 16:01 |
Dyrcona |
Or just a tool to diff marc. |
| 16:02 |
pdot2 |
allrighty, thanks Jeff, I'll try that out tomorrow |
| 16:03 |
miker |
Dyrcona: looks like a side-by-side view of breaker format, no? |
| 16:05 |
miker |
Dyrcona: ah ... reading further, they use a JS dif formatter to highlight differences. neat |
| 16:16 |
yboston |
worked great |
| 16:17 |
Dyrcona |
That could almost work in my case. |
| 16:17 |
Dyrcona |
yboston++ |
| 16:17 |
Dyrcona |
Anyway, we've decided to replace MARC on records where the quality dropped by 10 or more. |
| 16:17 |
jeff |
yboston++ handy and clever -- doesn't help Dyrcona's scenario, but close! :-) |
| 16:18 |
yboston |
:) |
| 16:18 |
Dyrcona |
We'll figure out what to do with the others later this week. |
| 08:22 |
Dyrcona |
Something must be cached. |
| 08:23 |
|
akilsdonk joined #evergreen |
| 08:24 |
|
ericar joined #evergreen |
| 08:25 |
Dyrcona |
On an unrelated note, I wish more vendors would accept UTF-8 in MARC records. |
| 08:26 |
Dyrcona |
'Cause I just should not be seeing messages like this in my email: no mapping found at position 0 in ⓟ2015. at /usr/share/perl5/MARC/Charset.pm line 384. |
| 08:27 |
Dyrcona |
Hrm.... That funkiness continues after a reboot! |
| 08:28 |
|
Newziky joined #evergreen |
| 08:30 |
Dyrcona |
Well, Google is no help..... I suspect this has something to do with python caching a compiled version somewhere. |
| 10:33 |
mmorgan |
I can think of one that applies here: New items can have a shorter loan period initially, then a longer loan period when they're no longer popular. |
| 10:33 |
gsams |
I can't say that I can figure one out for any of our libraries, that wouldn't be covered by some other option more easily. |
| 10:34 |
mmorgan |
Our circ modifiers are format based. book, dvd, etc. |
| 10:35 |
Dyrcona |
Our circ modifiers are modifiers: hot, bestseller, etc. We use MARC fields for DVDs, etc. |
| 10:35 |
* Dyrcona |
is kinda masochistic like that. ;) |
| 10:35 |
mmorgan |
No MARC fields here, just circ modifiers. |
| 10:36 |
Dyrcona |
What we've discussed with rule-based circ mods is encoding the loan duration, renewals, and holdability in the name somehow. |
| 12:15 |
tsbere |
dkyle1: From my docs, any of the following item forms will stop the book search filter/icon: a,b,c,f,o,q,r,s (d, for large print, only stops the icon) |
| 12:16 |
dkyle1 |
had a book that was displaying with an ebook icon, found that config.coded_value_map row id 587 has no corresponding coded_value entry in config.composite_attr_entry_definition |
| 12:16 |
tsbere |
dkyle1: ebooks show up with item_form set to one of o,q,s |
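A sketch of the item_form logic tsbere describes, using only the code lists quoted above. These lists are site-specific configuration, so yours may differ; the function is illustrative, not Evergreen's actual icon-format logic:

```python
# Item-form codes quoted above (site-specific; verify locally).
BOOK_BLOCKERS = set("abcfoqrs")  # any of these stops the book filter/icon
EBOOK_FORMS = set("oqs")         # these produce the ebook icon

def icon_for(item_form):
    if item_form in EBOOK_FORMS:
        return "ebook"
    if item_form and item_form in BOOK_BLOCKERS:
        return None   # neither book nor ebook
    return "book"     # blank or unlisted form falls through to book

print(icon_for("q"))  # -> ebook
print(icon_for(" "))  # -> book
```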
| 12:16 |
Dyrcona |
Check the MARC is the usual answer. |
| 12:17 |
tsbere |
dkyle1: MVLC's row 587 isn't ebook, it is "item_form=q", which the ebook and book icon_formats check indirectly. |
| 12:17 |
dkyle1 |
Dyrcona: indeed, and item_form showed blank in marc editor, but showed as q in metabib.record_attr_flat |
| 12:17 |
|
sandbergja joined #evergreen |
| 09:48 |
Dyrcona |
:) |
| 09:48 |
bshum |
Right :) |
| 09:56 |
|
jeff_ joined #evergreen |
| 09:56 |
Dyrcona |
csharp: On your marc export later, I thought we took care of that at one point, but I can't find a launch pad bug about it. |
| 09:58 |
Dyrcona |
Oh, wait, you're talking about the ones that are too large. |
| 09:58 |
Dyrcona |
That's not easy to fix, and might actually require changes to MARC::Record. |
| 09:58 |
Dyrcona |
Vendors should just a) accept MARCXML and b) ignore the size field. |
| 09:59 |
Dyrcona |
It's a throwback to the days of limited storage. |
| 09:59 |
Dyrcona |
With an end of record indicator in the binary format, it is also unnecessary. |
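Concretely: the first 5 bytes of a binary MARC leader hold the record length, but each record also ends with an end-of-record byte, 0x1D. A lenient reader can split on the terminator and ignore the length field entirely (toy byte strings below, not real MARC):

```python
# The leader's first 5 bytes are the record length, but 0x1D marks the
# end of each record anyway, so a reader can split on the terminator
# and never consult the length field. Toy records, not real MARC.
RT = b"\x1d"  # MARC end-of-record byte

def split_records(blob):
    return [r + RT for r in blob.split(RT) if r]

blob = b"00026 toy record one" + RT + b"00024 toy record 2" + RT
print(len(split_records(blob)))  # -> 2
```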
| 10:00 |
Dyrcona |
There's a lot of stupidity in the MARC format itself. |
| 10:00 |
jboyer-isl |
Dyrcona: it was designed for tape. If you know how long the record is you can skip ahead without looking for the EOR marker. |
| 10:00 |
Dyrcona |
Another option is just don't send them holdings, but that's the whole point of sending to OCLC. |
| 10:01 |
Dyrcona |
jboyer-isl: I know what it was designed for and it served that purpose at that time OK. It totally sucks today. |
| 10:14 |
Dyrcona |
True about the holdings. The original purpose was to send the tape and have cards printed for the card catalog. |
| 10:14 |
Dyrcona |
Everything else we've made the format do is tacked on. |
| 10:14 |
Dyrcona |
And, since we're not using linear storage, and it's 2015, vendors should just ignore the size field. |
| 10:15 |
Dyrcona |
Evergreen ignores it, as does anything that uses MARC::Record. |
| 10:15 |
Dyrcona |
"But this software has worked since 1985! Why should we change it?" |
| 10:36 |
|
dreuther joined #evergreen |
| 10:45 |
collum |
http://www.cincinnati.com/story/news/2015/03/20/appeals-court-ky-library-tax-legal/25076117/ |
| 14:15 |
Dyrcona |
I felt obligated to work on it when a Backstage import busted up someone's work. :) |
| 14:15 |
Dyrcona |
That's cool. Collaboration is nice. |
| 14:24 |
Dyrcona |
While testing that change, I noticed something unusual. |
| 14:25 |
Dyrcona |
Any time I typed a key in the marc editor, regular or flat text, while editing a record to import, a JavaScript error window would pop up. |
| 14:26 |
Dyrcona |
TypeError: tab is undefined |
| 14:26 |
Dyrcona |
Still doing it, actually. |
| 14:26 |
Dyrcona |
MARC edit works fine on regular bibs. |
| 14:27 |
Dyrcona |
Guess I'll do a git clean and see what happens. |
| 14:27 |
kmlussier |
dbs: Thanks for letting me know about mlnc4. I'll fix it up now. |
| 14:28 |
dbwells |
Dyrcona: I saw that too, when testing master with a 2.7 client. I know it stems from some changes ldw wrote a while back, and I thought it might be related to me needing a new client. |
| 11:08 |
tsbere |
Though I can't say much more than that because we were only running the code in production since last night. |
| 11:08 |
tsbere |
Oh, and we pushed the new view into production to stop the errors and are waiting to see if that makes new issues or not |
| 11:09 |
|
jihpringle joined #evergreen |
| 11:10 |
Dyrcona |
MARC types are not involved in the matchpoints that come up. |
| 11:11 |
Dyrcona |
Specifically, our matchpoints 2 and 314, not that that tells *you* anything useful. :) |
| 11:11 |
|
RoganH joined #evergreen |
| 11:20 |
bshum |
Hmm, maybe vr_format? |
| 11:28 |
bshum |
Gotcha... so I'd need to have some truly terrible bib records to try replicating some of that. |
| 11:28 |
jboyer-isl |
So is it looking like the lack of 008's + the left joins are causing the issue, or is that still an early assumption to make? |
| 11:28 |
|
vlewis joined #evergreen |
| 11:28 |
Dyrcona |
jboyer-isl: It looks like it is the left joins, definitely; as for what MARC tags are missing, I can't say. |
| 11:29 |
dbs |
Would the alternative to removing the left joins be COALESCE()ing the resulting nulls with some reasonable default? |
| 11:29 |
csharp |
sounds like we'll have similar issues then - we have lots of sh*tty data |
| 11:29 |
Dyrcona |
The records in question are created with a leader, 005, 082, and 245 only. |
| 14:56 |
* tsbere |
is never sure when he is authed to the bot |
| 14:56 |
* kmlussier |
always forgets how to authenticate to the bot. |
| 14:56 |
bshum |
phasefx++ # my curiosity is sated |
| 14:57 |
Dyrcona |
@quote add <jboyer-isl> MARC breaks everything. It's the anti-ILS whisperer. |
| 14:57 |
pinesol_green |
Dyrcona: The operation succeeded. Quote #101 added. |
| 14:57 |
bshum |
Permissions are weird. End of line. |
| 14:57 |
kmlussier |
Dyrcona++ |
| 15:02 |
jboyer-isl |
Bummer. |
| 15:03 |
Dyrcona |
@quote search jboyer-isl |
| 15:03 |
pinesol_green |
Dyrcona: 4 found: #101: "<jboyer-isl> MARC breaks everything. It's the...", #76: "<jboyer-isl> Our copy location table is looking...", #83: "< jboyer-isl> PEBCAKEs, while delicious, rarely...", and #89: "<jboyer-isl> Assumption soup isn’t nearly as..." |
| 15:03 |
Dyrcona |
@quote search marc |
| 15:03 |
pinesol_green |
Dyrcona: 6 found: #101: "<jboyer-isl> MARC breaks everything. It's the...", #38: "<jcamins> At least your MARC frameworks aren't...", #46: "<_bott_> I am not a cataloger, but I speak...", #52: "<dbs> MARC is not machine readable.", #75: "< _bott_> I fear that MARC was a humorous...", and #77: "< Dyrcona> Sure, send someone binary MARC..." |
| 15:03 |
* bshum |
laughed out loud at #52 again. |
| 15:04 |
jboyer-isl |
I was trying to see if search and who would work together. Either it doesn't process them in the correct order, or you'd have to do it a dozen times to hit one. |
| 11:48 |
Dyrcona |
You could try using OpenILS::Utils::Normalize::clean_marc on the records before writing to your output file. |
| 11:49 |
csharp |
Dyrcona: is that employed by marc_export at all? |
| 11:49 |
csharp |
eeevil: interesting - I know there's a history there that I've heard in bits and pieces over the years |
| 11:50 |
Dyrcona |
charp: Apparently not. It is normally done when MARC is added to the database these days, but you may have older data or data added via SQL that bypasses that. |
| 11:50 |
eeevil |
csharp: royt and Thom Hickey were my prime targets, if you want a starting point, IIRC |
| 11:51 |
* Dyrcona |
wonders what is wrong with his fingers this morning. |
| 11:51 |
Dyrcona |
csharp: see above |
| 12:03 |
Dyrcona |
No. It's on my laptop. :) |
| 12:03 |
csharp |
no prob - I'm on a box that has that anyway |
| 12:03 |
Dyrcona |
I was going to maybe add some more things to it. |
| 12:03 |
Dyrcona |
I use it when I want to mess with MARC on my laptop using DBI. |
| 12:03 |
csharp |
I wonder if that would be worth running on our full bib set at some point |
| 12:04 |
Dyrcona |
Might not be a bad idea. |
| 12:05 |
Dyrcona |
Something in Perl to pull the marc of each record, clean it, and update it. |
| 12:08 |
jeff |
it would be nice if OCLC didn't charge an arm and a leg to update your holdings with them. the only recent holdings we have are where we've gotten the MARC from them via CatExpress |
| 12:12 |
Dyrcona |
If you ever pull MARCXML from the database and it comes straight out with more than 1 line, then you should at least run clean_marc on those records. |
| 12:12 |
Dyrcona |
Multiple lines is a sure sign that the MARC was not "fixed" for Evergreen. |
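One visible part of that fixing is collapsing the record to a single line. A rough analogue of that normalization (an approximation, not OpenILS::Utils::Normalize::clean_marc itself): strip whitespace that sits between elements.

```python
import re

def flatten_marcxml(xml):
    """Drop whitespace between elements so the MARCXML is one line.
    An approximation of clean_marc's flattening, not Evergreen's code;
    it leaves real subfield text alone."""
    return re.sub(r">\s+<", "><", xml.strip())

pretty = """<record>
  <leader>00000nam a2200000 a 4500</leader>
</record>"""
print("\n" in flatten_marcxml(pretty))  # -> False
```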
| 12:13 |
csharp |
@fix ALL THE MARC |
| 12:13 |
pinesol_green |
csharp: Zoia knows how to make fusilli. |
| 12:14 |
Dyrcona |
What do you know, pinesol_green ? |
| 13:45 |
jeff |
in this case it was my own fault. i was eating MARC-8 with MARC::Record then writing out to a file that I had set as :utf8 |
| 13:46 |
Dyrcona |
Yep. |
| 13:47 |
jeff |
I'm tempted to transform these incoming records to UTF-8 using yaz-marcdump. |
| 13:47 |
Dyrcona |
But, I always seem to have the UTF-8 characters that won't map to MARC-8. |
| 13:48 |
Dyrcona |
When exporting to a vendor. |
| 13:48 |
Dyrcona |
jeff: Might be a good idea. |
| 13:48 |
Dyrcona |
I like using yaz-marcdump to convert binary MARC to xml and UTF-8 at the same time, even. :) |
| 13:49 |
jeff |
I thought that MARC::Record had more obvious transcoding support. Maybe I'm just thinking of MARC::File::Encode |
| 13:49 |
Dyrcona |
Often makes those "vendor records" easier for MARC::Record to digest. Just ignore the comments in the output. :) |
| 13:50 |
jeff |
no, if yaz-marcdump can't parse without errors/warnings, you reject the file ;-) |
| 13:51 |
Dyrcona |
Then, I'd almost never load anything. ;) |
| 13:51 |
jeff |
last if $outcount >= 50; |
| 08:29 |
|
collum joined #evergreen |
| 08:41 |
|
mmorgan joined #evergreen |
| 08:44 |
csharp |
hmmm - I'm seeing an issue with marc_export where I feed it a list of 30K bib ids, it processes around 400, then dies with "substr outside of string at /usr/share/perl5/MARC/Record.pm line 568." |
| 08:44 |
Dyrcona |
Bad MARC data. |
| 08:45 |
Dyrcona |
If it comes from MARC::Record, it's converting the record at that point. |
| 08:45 |
csharp |
interestingly, when I begin the next file at line 401 (for instance), it processes 400 more, then dies again |
| 08:45 |
csharp |
I'm having trouble isolating the source |
| 08:46 |
Dyrcona |
Well, I've never seen that and I regularly dump almost 1 billion bibs. |
| 08:46 |
csharp |
1 MILLION DOLLARS! |
| 08:47 |
Dyrcona |
A simple way? I don't think so. |
| 08:47 |
csharp |
ok - that's consistent with my reading of the marc_export code |
| 08:50 |
Dyrcona |
You might try outputting records as bre, but that bypasses the conversion to MARC. |
| 08:51 |
csharp |
I'm going to go brute force and export each id individually until I hit the bad record |
| 08:52 |
|
DPearl joined #evergreen |
| 08:52 |
|
mrpeters joined #evergreen |
| 08:54 |
Dyrcona |
csharp: Or are you using the old one? |
| 08:55 |
phasefx |
csharp: just curious, are you leaving things as xml? |
| 08:55 |
Dyrcona |
Yeah, I was wondering about that, myself. It's probably blowing up around line 515. |
| 08:56 |
Dyrcona |
That's the call to MARC::Record->as_usmarc, and it's the one MARC::Record call not wrapped in an eval block. |
| 08:59 |
csharp |
Dyrcona: I'm using the new one |
| 08:59 |
csharp |
I'm outputting USMARC |
| 09:00 |
Dyrcona |
csharp: I'll post a patch that you could apply to your marc_export that should output the id of the failed records and continue exporting, unless you want the export to stop. |
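The shape of that fix, sketched in Python: wrap each record's conversion, report the failing record's id, and keep exporting. Here `convert` is a stand-in for the MARC::Record->as_usmarc call that marc_export made without an eval around it:

```python
# Trap per-record conversion errors, log the id, keep going --
# the Python shape of wrapping as_usmarc in a Perl eval{} block.
def export_all(records, convert, warn=print):
    out = []
    for rec_id, data in records:
        try:
            out.append(convert(data))
        except Exception as e:
            warn(f"Skipping record {rec_id}: {e}")
    return out

records = [(1, "ok"), (2, None), (3, "ok")]
exported = export_all(records, lambda d: d.upper())  # None blows up, is skipped
print(exported)  # -> ['OK', 'OK']
```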
| 11:45 |
jeff |
@later tell krvmga the responsive-web one is pretty much from scratch (with some jQuery). if we decide to continue down that path we might incorporate angularjs. I'm also interested in eliminating/reducing the role/need of the Rails app that currently works as a JSON<->TPAC screen scraping gateway. |
| 11:45 |
pinesol_green |
jeff: The operation succeeded. |
| 11:46 |
jeff |
Since they're experiments, there are a lot of things in there where we've done it, but learned enough to know that we would want to do it differently next time. :-) |
| 12:18 |
Dyrcona |
@marc 505 |
| 12:18 |
pinesol_green |
Dyrcona: The titles of separate works or parts of an item or the table of contents. The field may also contain statements of responsibility and volume numbers or other sequential designations. (Repeatable) [a,g,r,t,u,6,8] |
| 12:25 |
Dyrcona |
Perl code like this always looks "wrong" to me, even when it is in my own code: substr($_, 26, 1) = 'i'; |
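The reason it looks "wrong" is that Perl's substr is an lvalue there: the assignment writes one character in place. Python strings are immutable, so the same fixed-position edit (e.g. patching position 26 of a 40-character field) is slice-and-rebuild:

```python
# Python equivalent of Perl's lvalue substr: substr($_, 26, 1) = 'i';
# Strings are immutable, so rebuild around the changed position.
def set_char(s, pos, ch):
    return s[:pos] + ch + s[pos + 1:]

field = "x" * 40  # stand-in for a fixed-length coded field
print(set_char(field, 26, "i")[26])  # -> i
```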
| 12:44 |
|
mtcarlson joined #evergreen |