09:08 |
Dyrcona |
The Joy of Software -- for certain definitions of joy. |
09:12 |
csharp |
Dyrcona: does the new marc_export script output any information about records that are larger than 99999 bytes? the old one just outputs "Record length of 223776 is larger than the MARC spec allows" (for example) from MARC::Record |
09:12 |
csharp |
we're trying to identify the records that end up being too big when doing bibs + holdings |
09:12 |
Dyrcona |
csharp: It might output to standard error. IIRC, MARC::Record does that. |
09:13 |
csharp |
okay - I figured that was the case - thanks |
09:13 |
bshum |
csharp: I think that was just a warning though, I thought it still attempted to output the record anyways. |
09:13 |
Dyrcona |
I get a lot of messages about bad character conversions to standard error when I run it. |
09:14 |
* bshum |
remembers seeing huge records go through somewhere, sometime. |
09:14 |
Dyrcona |
bshum csharp: Last time that I looked into it, the record that is too big gets output with size set to maximum, 99999. |
09:14 |
* bshum |
hates that the warning is the size, not the bib ID |
09:15 |
Dyrcona |
I recently had fun with a vendor who calculated record size using characters, while the standard and MARC::Record count bytes. |
09:15 |
Dyrcona |
characters <> bytes |
09:16 |
Dyrcona |
bshum: I'm not sure the bib id can be reliably determined at the level where the error occurs. |
09:16 |
Dyrcona |
I suppose marc_export could be modified to check for the error and alter the output. |
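A minimal sketch of the check Dyrcona is suggesting, assuming the oversize message is raised via a plain Perl warn() as the discussion above implies; the export_record() wrapper and its bib-ID plumbing are hypothetical:

    use strict;
    use warnings;
    use MARC::Record;

    # Hypothetical wrapper: tag the "Record length ... larger than the
    # MARC spec allows" warning with the bib ID so the log is actionable.
    # Assumes MARC::Record raises the message via a normal warn().
    sub export_record {
        my ($bib_id, $record) = @_;
        local $SIG{__WARN__} = sub {
            my $msg = shift;
            $msg = "bib $bib_id: $msg"
                if $msg =~ /larger than the MARC spec allows/;
            warn $msg;   # inside a __WARN__ handler, goes straight to STDERR
        };
        return $record->as_usmarc();
    }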
11:24 |
bshum |
Hmm, I think it's the usual "reingest on same MARC" force-reingest deal
11:26 |
Dyrcona |
bshum: But I also run a query to reingest metabib.record_attributes, mainly for deleted records, though it does them all to be safe.
11:28 |
Dyrcona |
bshum: Something like http://git.mvlcstaff.org/?p=jason/evergreen_utilities.git;a=blob;f=scripts/pingest.pl;h=2b43e9a45b20e10325217f86579002b5c0e14315;hb=cc9a96dc734997946c8027d41c9252a545dc05a8 ? |
11:28 |
Dyrcona |
I don't think you have to set the reingest on same marc if you use the above. |
11:37 |
bshum |
Dyrcona: Cool! I'll ponder that one and see what it does. Thanks for sharing. |
11:42 |
dbwells |
bshum: We've mostly kept it pretty simple and just generated batches of UPDATE commands (similar to the 2.6 example), then split them up and run them on multiple psql (or perl DBI) processes. |
11:43 |
dbwells |
bshum: In our experience, as long as you don't use transactions or try to do UPDATEs on ranges of IDs, we don't have any locking problems.
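A sketch of that batch approach with Perl DBI, under the assumptions dbwells states: AutoCommit on (no wrapping transaction) and one UPDATE per ID. The connection parameters are hypothetical, and the no-op SET id = id relies on Evergreen's indexing triggers firing on UPDATE, as in the 2.6 example he mentions:

    use strict;
    use warnings;
    use DBI;

    # Hypothetical connection; adjust dbname/host/credentials.
    my $dbh = DBI->connect('dbi:Pg:dbname=evergreen;host=localhost',
                           'evergreen', 'password',
                           { AutoCommit => 1, RaiseError => 1 });

    # One UPDATE per bib ID, each committing on its own, so locks stay
    # short-lived. Split the ID list across several such processes.
    my $sth = $dbh->prepare(
        'UPDATE biblio.record_entry SET id = id WHERE id = ?');
    while (my $id = <STDIN>) {
        chomp $id;
        $sth->execute($id);
    }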
12:13 |
* pinesol_green |
fills a pint glass with Magic Hat Blind Faith, and sends it sliding down the bar to Dyrcona (http://beeradvocate.com/beer/profile/96/298/) |
12:23 |
* Dyrcona |
is feeling lazy so I'll ask here. |
12:23 |
|
tonyb_ohionet joined #evergreen |
12:23 |
Dyrcona |
Is there a way to have MARC::Record accept a malformed record similar to MARC::Batch's strict_off option? |
12:24 |
Dyrcona |
I know I can use an eval block to catch the error, but I actually want to delete the offending tag. |
12:44 |
eeevil |
Dyrcona: I suspect it depends on the malformedness... do you know the issue? |
12:44 |
eeevil |
or is it just "record fails. goodbye" |
12:45 |
|
ldwhalen joined #evergreen |
12:45 |
eeevil |
ah ... I wonder if yaz can read it? (you've probably attempted that...) |
12:46 |
eeevil |
also, marcxml? |
12:46 |
Dyrcona |
I am pulling the marc right out of our database. |
12:47 |
Dyrcona |
Weird thing, with the eval block, $@ is not getting set. |
12:48 |
Dyrcona |
I am using Perl 5.18 if that matters. |
12:48 |
Dyrcona |
Yes, it is marcxml. |
12:58 |
gmcharlt |
bshum: as you can see, I've established a "deprecation" tag as well |
12:58 |
bshum |
gmcharlt++ |
12:59 |
jeff |
deprecation++ excision++ |
12:59 |
Dyrcona |
Interestingly, $MARC::Record::ERROR does not appear to get set as the documentation says it will: Use of uninitialized value $MARC::Record::ERROR in print at Src/egmisc/laserdisc_fix.pl line 39 |
13:00 |
jeff |
Dyrcona: i believe that's a difference between parsing with MARC::File::USMARC and MARC::File::XML |
13:01 |
eeevil |
Dyrcona: gotcha ... your problem sounds familiar, but I can't find its like on LP |
13:01 |
Dyrcona |
eeevil: It might be on CPAN's RT... |
13:05 |
jeff |
Ideally they wouldn't get into biblio.record_entry.marc if they're malformed. I don't know if that's just imported or older record data. I haven't tested whether you can still sneak bad data in on current Evergreen.
13:10 |
|
hbrennan joined #evergreen |
13:16 |
Dyrcona |
bshum++ |
13:25 |
Dyrcona |
On my MARC::Record error: I can print $@ to get the error message. |
13:25 |
Dyrcona |
Just for those following along at home. :) |
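For those following along, a minimal sketch of the eval pattern being described: parse MARCXML pulled from the database and read the failure from $@ (which, per the exchange above, gets set even when $MARC::Record::ERROR does not). The record source and the retry idea are placeholders:

    use strict;
    use warnings;
    use MARC::Record;
    use MARC::File::XML (BinaryEncoding => 'UTF-8');

    my $marcxml = do { local $/; <STDIN> };   # e.g. piped out of the database

    my $record = eval { MARC::Record->new_from_xml($marcxml, 'UTF-8') };
    if ($@) {
        print STDERR "parse failed: $@";
        # ... strip the offending tag from the raw XML here and retry,
        # which is roughly what the paste below does.
    }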
13:28 |
Bmagic |
Dyrcona: This is something I have been dealing with as well; I would be interested in seeing what you come up with
13:30 |
pastebot |
"Dyrcona" at 64.57.241.14 pasted "What I'm doing about my bad record(s) for now." (53 lines) at http://paste.evergreen-ils.org/10 |
09:21 |
jl- |
morning |
09:22 |
jl- |
Besides authority and bib records, Voyager also gave me 'item' records and 'mfhd' records.. do I need to treat them specially or can I import them as usual?
09:26 |
|
eby__ joined #evergreen |
09:27 |
Dyrcona |
mfhd records should follow a standard and have the information in the MARC. The old export tools could supposedly import them. |
09:27 |
Dyrcona |
Item records could be anything at all. |
09:29 |
jl- |
Dyrcona: thanks. I'm just now seeing that the item file is .txt and not .mrc |
09:29 |
jl- |
lol |
09:38 |
jl- |
and thanks for the koha link jcamins, that is the next ILS we are wanting to test |
09:39 |
jcamins |
There are other versions of the migration toolkit, so you might want to look around to see if anyone has anything better. |
09:39 |
dbs |
jl-: mfhd for serials may end up going into serial.record_entry if your library doesn't circulate your serials |
09:39 |
Dyrcona |
jl-: Looking at the code that jcamins pointed out, it appears to expect some item information in the bibliographic (MARC) records. That gets matched up with other information from items.txt later. |
09:40 |
|
yboston joined #evergreen |
09:40 |
mrpeters |
jl-: yeah, that can be converted to tab-delimited real easy (that's what I would do)
09:40 |
mrpeters |
did it include any headers? |
10:08 |
RoganH |
There is a learning curve with Evergreen especially in terms of complexity if you're coming from an older product like Athena that was just a few steps removed from a flat text file database. |
10:08 |
RoganH |
The marc records can be completely fine but that doesn't mean there are any call number or copy records present. |
10:08 |
RoganH |
Copy records attach to call number records which attach to bib records (basically) |
10:08 |
Dyrcona |
ats_JC: How did you import the MARC records into Evergreen? |
10:09 |
ats_JC |
we extracted the data sets from Athena
10:10 |
ats_JC |
then we had our IT staff do the conversion
10:10 |
ats_JC |
he followed the steps in the documentation on the Evergreen website
10:12 |
Dyrcona |
Depending on the method used to import the records, there are different ways to get the records for call numbers and copies created. |
10:12 |
ats_JC |
ahaha |
10:12 |
ats_JC |
noted |
10:13 |
Dyrcona |
No, you can possibly have them made after the fact if you know what MARC fields and subfields hold the necessary information. |
10:13 |
Dyrcona |
It will take some custom code, though. |
10:14 |
|
mceraso joined #evergreen |
10:14 |
ats_JC |
thank you sir. I'll try to get him to join the conversation
10:30 |
|
kbeswick joined #evergreen |
10:33 |
|
ats_JC joined #evergreen |
10:39 |
ats_JC |
hi guys. how can we enter new books into Evergreen?
10:41 |
Dyrcona |
Well, you could have your staff catalog everything by hand, but it sounds like you already have the records loaded with copy information in some MARC field. |
10:42 |
Dyrcona |
You need to find out what field has that information and what fields in the asset.call_number and asset.copy tables the subfields correspond to. |
10:43 |
Dyrcona |
Then, you need to have someone write a program to extract the information from the marc in the biblio.record_entry table and to create the asset.call_number and asset.copy table entries. |
10:43 |
Dyrcona |
Piece of cake! ;) |
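A sketch of that custom code. The 852 tag and its $j (call number) / $p (barcode) subfields are hypothetical stand-ins for wherever the legacy system stashed its item data:

    use strict;
    use warnings;
    use MARC::Batch;

    my $batch = MARC::Batch->new('USMARC', 'bibs.mrc');
    while (my $record = $batch->next()) {
        # Hypothetical holdings tag; substitute your real one.
        for my $holding ($record->field('852')) {
            my $call_no = $holding->subfield('j') // next;
            my $barcode = $holding->subfield('p') // next;
            # The real script would INSERT INTO asset.call_number and
            # asset.copy via DBI, keyed on the biblio.record_entry id.
            print "call number: $call_no, barcode: $barcode\n";
        }
    }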
10:44 |
|
kbeswick joined #evergreen |
10:44 |
ats_JC |
oh thank you!!! :) |
10:05 |
pinesol_green |
Dyrcona: The current temperature in WB1CHU, Lawrence, Massachusetts is 28.9°F (10:05 AM EDT on March 27, 2014). Conditions: Clear. Humidity: 26%. Dew Point: -2.2°F. Windchill: 28.4°F. Pressure: 30.31 in 1026 hPa (Steady). |
10:12 |
|
RoganH joined #evergreen |
10:14 |
jl- |
is the record id in datafield 010 here? http://paste.debian.net/89896/ |
10:15 |
Dyrcona |
@marc 010 |
10:15 |
pinesol_green |
Dyrcona: A unique number assigned to a MARC record by the Library of Congress. Valid MARC prefixes for LC control numbers are published in MARC 21 Format for Bibliographic Data. [a,b,z,8] |
10:16 |
bshum |
I would have suspected the 001 was the record ID, with 003 being the source name of the system?
10:17 |
bshum |
But maybe that's a lie, like most MARC seems to be. |
10:17 |
Dyrcona |
@marc 001 |
10:17 |
pinesol_green |
Dyrcona: The control number assigned by the organization creating, using, or distributing the record. The MARC code for the organization is contained in field 003 (Control Number Identifier). [] |
10:18 |
Dyrcona |
I'm not sure that SQP is a "valid" MARC code as assumed by LoC. It may be an org unit identifier from the legacy system.
10:19 |
Dyrcona |
bshum: The MARC documentation isn't a "lie," per se. It's just that many catalogers and vendors ignore standards when producing records. |
10:19 |
|
kbutler joined #evergreen |
10:19 |
Dyrcona |
Or, they use OCLC's slightly different ideas of what MARC should look like. |
10:20 |
Dyrcona |
Says the guy who routinely says, "Documentation lies." :) |
10:20 |
Dyrcona |
jl-: To summarize, the record id is more likely in the 001 as bshum suggests. However, I wouldn't rely on that. |
10:21 |
bshum |
Well, I meant to say that the MARC is a lie, meaning the record isn't formatted correctly. |
10:22 |
|
rjackson-isl joined #evergreen |
10:22 |
eeevil |
Dyrcona: now /that/ would be a great t-shirt quote |
10:22 |
eeevil |
"did you check the 035?" |
10:22 |
Dyrcona |
jl-: If you'll be merging records from different sources, I'd assign the incoming MARC records new ids on import. |
10:23 |
Dyrcona |
jl-: If you set the right flag, the 001 and 003 will get moved to the 035. |
10:25 |
Dyrcona |
jl-: In your example record, it would look something like (SQP) 15788, similar to the existing 035 with the OCLC number. |
10:26 |
|
fparks_ joined #evergreen |
10:37 |
jl- |
yes but I'm taking it one at a time |
10:37 |
jl- |
still on the first |
10:37 |
jl- |
;) |
10:38 |
Dyrcona |
OK. It is probably best to find out whether Voyager can export the record id to a given MARC field. That way you know with some certainty what it is.
10:38 |
Dyrcona |
As bshum mentions, relying on the 001 is a bit iffy. |
10:39 |
jl- |
hmm ok, thanks |
10:40 |
Dyrcona |
001 is supposed to be unique per source (003), but I've seen at least one ILS that doesn't honor that. |
14:46 |
hbrennan |
it produces holdings info |
14:46 |
hbrennan |
I still haven't implemented it |
14:46 |
|
jeff_ joined #evergreen |
14:47 |
Dyrcona |
jl-: If you really want to get lost in the minutiae: http://www.loc.gov/marc/ |
14:47 |
|
jeff_ joined #evergreen |
14:47 |
hbrennan |
Dyrcona: haha. I'm bored.. think I'll just read MARC rules today |
14:48 |
jl- |
Dyrcona: I'd rather not; the question is whether these need to be imported into Evergreen
15:11 |
Dyrcona |
Will yaz-marcdump do that? -- Put them one record per line? |
15:11 |
Dyrcona |
I've never bothered. |
15:11 |
jl- |
hmm, do they need to be put into one per line? |
15:12 |
Dyrcona |
I usually write something with MARC::Batch to read each record, then call an Evergreen utility function to flatten the marc before loading. |
15:12 |
Dyrcona |
jl-: It depends on how you are loading them. Some of the methods say to do that. |
15:12 |
Dyrcona |
You want the marcxml to be stripped and cleaned in the database. |
15:13 |
jl- |
Dyrcona: this is the method I was going to use docs.evergreen-ils.org/dev/_migrating_your_bibliographic_records.html |
15:54 |
jl- |
< Dyrcona> I usually write something with MARC::Batch to read each record, then call an Evergreen utility function to flatten the marc before loading. |
15:54 |
jl- |
could you expand? |
15:54 |
jl- |
:) |
15:56 |
Dyrcona |
OpenILS::Utils::Normalize has a function, clean_marc(), that prepares a MARC record for the Evergreen database.
15:57 |
Dyrcona |
You could write a script that uses MARC::Batch to read your MARC file and call that function on each record then print each record with a linefeed after it. |
15:57 |
Dyrcona |
Or, you could just load the records from your script. |
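A sketch of the script Dyrcona describes, assuming clean_marc() accepts a MARC::Record and returns single-line MARCXML ready for the database (check OpenILS::Utils::Normalize in your Evergreen version):

    use strict;
    use warnings;
    use MARC::Batch;
    use OpenILS::Utils::Normalize qw(clean_marc);

    my $batch = MARC::Batch->new('USMARC', $ARGV[0]);
    $batch->strict_off();   # keep going past malformed records

    while (my $record = $batch->next()) {
        # clean_marc() flattens the record to database-ready MARCXML;
        # one record per line, as the loading docs expect.
        print clean_marc($record), "\n";
    }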
16:20 |
jl- |
thanks and good night |
16:42 |
dreuther_ |
eeevil: I am taking over the work on https://bugs.launchpad.net/evergreen/+bug/1152863 Would you be available sometime in the next day or so to answer a few questions I have? Thanks |
14:40 |
dave_bn |
csharp: is it possible to integrate those sources without importing the marc records from them? |
14:41 |
csharp |
dave_bn: I'll defer to others on that question |
14:41 |
csharp |
dave_bn: but you can include URLs to external resources in the MARC record (856) - those work
14:42 |
Dyrcona |
dave_bn: You have to import MARC records. |
14:42 |
ldwhalen |
ktomita: I can reproduce both issues you have found. I will fix them, and update the commit by tongiht. |
14:42 |
ldwhalen |
ktomita++ |
14:43 |
dave_bn |
Dyrcona: if I import records, is there a chance to keep them updated? what if a book is removed from the external collection?
14:44 |
ktomita |
ldwhalen: I will try to take a look at the update today or tomorrow. |
14:44 |
dave_bn |
Dyrcona: and how would I know which marc records have changed or are new, e.g. from archive.org |
14:44 |
Dyrcona |
dave_bn: That depends on the vendor. Generally, I get update files that tell me which bibs to delete and will have any new bibs to add. |
14:44 |
Dyrcona |
dave_bn: Another useful thing is to create a MARC source for each vendor. |
14:45 |
Dyrcona |
dave_bn: Also note, I don't use the normal Evergreen tools for keeping the records up to date. I generally write my own. |
14:46 |
dave_bn |
Dyrcona: I see. Would it not be more convenient if external sources were integrated the same way as Google Books?
14:46 |
bshum |
Absolutely more convenient. |
10:41 |
jl- |
yes |
10:41 |
jl- |
that's probably what I should look into more |
10:42 |
Dyrcona |
I migrate data by writing a fresh set of scripts for each migration, because I've not done any where the input data was in a consistent format. |
10:43 |
Dyrcona |
Horizon's MARC data was interesting to say the least. |
10:43 |
phasefx |
there are scripts that will take a base table like asset.copy, and make an identically shaped table complete with the same sequences, like m_foo.asset_copy. And then, based on the incoming data, we'd make an m_foo.asset_copy_legacy that inherits from that table, with the legacy columns added in. Then the mapping process would be moving that data from the legacy columns to the stock columns
10:43 |
phasefx |
(and other tables). Then when ready to push the data into the real tables, you just INSERT INTO asset.copy SELECT * FROM m_foo.asset_copy; and it just works |
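A compressed sketch of that staging pattern via Perl DBI (m_foo is phasefx's placeholder schema; the legacy columns shown are hypothetical):

    use strict;
    use warnings;
    use DBI;

    my $dbh = DBI->connect('dbi:Pg:dbname=evergreen', 'evergreen', '',
                           { RaiseError => 1 });

    $dbh->do('CREATE SCHEMA m_foo');
    # Same shape (and same sequence defaults) as the production table.
    $dbh->do('CREATE TABLE m_foo.asset_copy
              (LIKE asset.copy INCLUDING DEFAULTS)');
    # Child table carrying the incoming data's extra columns.
    $dbh->do('CREATE TABLE m_foo.asset_copy_legacy (
                  legacy_barcode TEXT,   -- hypothetical legacy columns
                  legacy_price   TEXT
              ) INHERITS (m_foo.asset_copy)');
    # ... map the legacy columns into the stock columns, then:
    $dbh->do('INSERT INTO asset.copy SELECT * FROM m_foo.asset_copy');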
10:44 |
Dyrcona |
I do all my stuff in Perl and just insert directly into asset.copy, etc. |
15:16 |
jl- |
would it be wrong to assume that it's the 84th record?
15:16 |
jl- |
in the .sql file ? |
15:16 |
jl- |
Dyrcona |
15:16 |
Dyrcona |
Yeah. The parts have a like to a biblio.record_entry, which is a bibliographic (MARC) record in the database.
15:17 |
Dyrcona |
s/like/link/ |
15:18 |
jl- |
Dyrcona: so I'm not exactly sure about the meaning of a foreign key constraint
15:18 |
jl- |
does it mean that record 84 has an illegal char? |
11:14 |
jeff |
how did you send the records to them? |
11:15 |
jeff |
in terms of, i'm assuming they're MARC and not MARCXML... |
11:15 |
jeff |
how were they generated on your end? |
11:19 |
Dyrcona |
Using my Marque.pm branch and the usual rain dance with MARC::Record. |
11:20 |
Dyrcona |
Yes, they are MARC, not MARCXML. |
11:20 |
Dyrcona |
EBSCO prefers MARC, not MARCXML. |
11:22 |
eeevil |
just a guess, but perhaps m::r is counting the record separator as part of the preceding record, but marc4j is not |
11:23 |
eeevil |
(also, the marc specs are ambiguous WRT characters vs bytes, and use the terms interchangeably throughout) |
11:24 |
Dyrcona |
I'm wildly speculating on the marc4j only because I had trouble with it during our migration. It expected the size fields from Horizon to be correct and they were off by miles. |
14:26 |
gmcharlt |
tsbere: the pushme to your pullyou is libraries who want to display accumulated fines on their notices, after the fine generator has run |
14:27 |
gmcharlt |
though that may be an argument for scheduling it to run at more like 7 a.m., at least for small enough systems where you don't need all of the early morning to process notices |
14:31 |
|
pinesol_green` joined #evergreen |
14:32 |
Dyrcona |
Sure, send someone binary MARC records in a docx, why the Hell not!? |
14:32 |
bshum |
Are we putting that in Marque.pm? Cause if so, then I think we may have gone too far now.... ;) |
14:32 |
Dyrcona |
EBSCO sent me this. |
14:33 |
Dyrcona |
Or, rather, sent it to a library director who sent it to someone else, who finally sent it to me today.
14:36 |
csharp |
docquex |
14:36 |
Dyrcona |
csharp++ |
14:36 |
senator |
Dyrcona: at least it wasn't printed and faxed somewhere along the way
14:37 |
Dyrcona |
I think I've seen that, or very close: a printed and faxed screenshot of the MARC editor came my way once, just after migration.
14:37 |
* csharp |
often gets "screenshots" taken with a smarthphone camera |
14:37 |
csharp |
smartphone, even |
14:37 |
Dyrcona |
bshum: screamshots. :) |
14:51 |
berick |
yay, i guess |
14:52 |
jeff |
i found these to be most helpful: "^123 +(W.*)? *(Eleventh|11th)" and "^456 +(E.*)? *(Eighth|8th)" |
14:53 |
jeff |
match 123 W Eleventh St and 123 West Eleventh Street and 123 11th but not 123 W 11th Street, etc. |
14:58 |
Dyrcona |
I wonder what they'll do if I send them UTF-8, instead of trying to convert to MARC-8. |
15:08 |
|
ktomita_ joined #evergreen |
15:17 |
AaronZ-PLS |
Hmmm, apparently my earlier message didn't go through.
15:17 |
AaronZ-PLS |
Is there a permission to control if a user (or group of users) can perform pre-cataloged checkouts? |
15:54 |
gmcharlt |
and ITEM_NOT_CATALOGED would be what one would grep for regarding pre-cats, as opposed to non-cats |
15:54 |
Dyrcona |
Yes, most likely. |
15:55 |
AaronZ-PLS |
Dyrcona: I agree. It would be useful to allow certain users to create pre-cat circs but not require that they have the same permissions as a cataloger.
15:55 |
Dyrcona |
gmcharlt: I want to change the subject, and since you're responsible for MARC::Charset, you hopefully know the answer. |
15:55 |
* gmcharlt |
does have a preference, unless a patch is accompanying it right away, that bug reports list problems, maybe suggest solutions, but don't try to overdetermine the implementation of said solutions |
15:55 |
gmcharlt |
Dyrcona: I'm all ears |
15:55 |
* Dyrcona |
agrees that bug reports should detail problems and not define solutions. |
15:57 |
AaronZ-PLS |
csharp: We use a custom reporting interface which saves the data as a .xls file for the staff to use |
15:58 |
gmcharlt |
Dyrcona: well, decode() or decode_utf8() is in fact what needs to happen to get a UTF8 string, to make CORE::length count characters and not octets |
15:58 |
gmcharlt |
the question would be what the character encoding of the octets is at the point where you're doing the decode
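gmcharlt's point in miniature: length() counts octets on a byte string and characters once the octets are decoded. The example uses the UTF-8 bytes for the musical flat Dyrcona mentions below:

    use strict;
    use warnings;
    use Encode qw(decode_utf8);

    my $octets = "\xE2\x99\xAD";          # UTF-8 bytes for U+266D (flat sign)
    my $chars  = decode_utf8($octets);    # now a Perl character string
    printf "octets: %d, characters: %d\n",
           length($octets), length($chars);   # prints "octets: 3, characters: 1"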
15:59 |
Dyrcona |
It's a MARC record from our database, so it should already be UTF-8.
15:59 |
gmcharlt |
AaronZ-PLS: is the reporting interface capable of specifying that all selected fields should be text, not general? |
15:59 |
gmcharlt |
(when it outputs the spreadsheet) |
15:59 |
Dyrcona |
And the character in question is a musical flat symbol, which looks good on my UTF-8 terminal. |
16:00 |
|
hbrennan joined #evergreen |
16:02 |
Dyrcona |
It does, and I added a print to stderr that shows I'm not using set_leader_lengths correctly. |
16:03 |
Dyrcona |
Or, does as_usmarc call set_leader_lengths again? |
16:03 |
Dyrcona |
I'm trying to mangle the size of my MARC record in case you couldn't tell. |
16:05 |
gmcharlt |
as_usmarc does in fact call set_leader_lengths
16:05 |
gmcharlt |
and if it's working correctly, should always be setting the leader and directory fields to be based on octets, not characters |
16:06 |
Dyrcona |
gmcharlt: It is, but ebsco apparently counts characters, not octets. |
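For illustration only, a sketch of the "mangle" being attempted: overwrite leader bytes 00-04 with the character count after as_usmarc() has set them to octets. As Dyrcona concludes below, this violates the spec's octet-based definition of length, so it is shown purely to clarify the disagreement:

    use strict;
    use warnings;
    use Encode qw(decode_utf8);

    # Takes the raw output of $record->as_usmarc() and restates the
    # record length in characters rather than octets. Spec-violating;
    # don't actually do this.
    sub character_length_leader {
        my $usmarc = shift;
        my $chars  = length(decode_utf8($usmarc));
        substr($usmarc, 0, 5) = sprintf('%05d', $chars);
        return $usmarc;
    }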
16:29 |
dbwells |
Dyrcona: probably just a past problem, but your example record has an extra </datafield> after the 852 |
16:29 |
dbwells |
s/past/paste/ |
16:29 |
* Dyrcona |
doesn't care enough to bother with that. |
16:29 |
Dyrcona |
Straight out of the MARC documentation: Computer-generated, five-character number equal to the length of the entire record, including itself and the record terminator. The number is right justified and unused positions contain zeros. |
16:30 |
Dyrcona |
So, MARC doesn't say if it is octets or characters.
16:30 |
Dyrcona |
"length. A measure of the size of a data element, field, or record and is expressed in number of octets." |
16:31 |
Dyrcona |
But they do, and EBSCO is wrong! |
16:32 |
Dyrcona |
I am now done with this conversation, and I'm making no changes for EBSCO. |
18:32 |
jtaylorats |
I'm probably not going to use PSQL for the next load but was curious. |
18:38 |
jtaylorats |
Couldn't find any reference to such an error anywhere. |
18:42 |
bshum |
I don't have much experience in that area, so I'm afraid you will have to wait for someone else. |
18:43 |
Dyrcona |
jtaylorats: How are you converting the MARC records to UTF-8 and what are they doing having WIN1252 characters in them? (As if I didn't know that all MARC records suck.) |
18:44 |
jtaylorats |
I thought they were in UTF-8 already. Using MarcEdit to convert them to MarcXML. Have been trying to figure out if I missed a step somewhere. Maybe I need to do something a bit more explicit during the prepping phase. Thought it was covered but apparently not. |
18:45 |
jtaylorats |
Hard to say why those characters are in there. |
18:45 |
Dyrcona |
Oh, it's easy to say why the characters are in there: Most software that works with MARC records just plain sucks.
18:46 |
Dyrcona |
When I load records I usually run them through some Perl code to convert them to MarcXML, convert the charset if necessary, and to "scrub" the records of control characters and other junk, first. |
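A sketch of that prep pass: binary MARC in, scrubbed one-line MARCXML out. The control-character class is illustrative, not a complete "junk" filter:

    use strict;
    use warnings;
    use MARC::Batch;
    use MARC::File::XML (BinaryEncoding => 'UTF-8');

    my $batch = MARC::Batch->new('USMARC', $ARGV[0]);
    $batch->strict_off();

    while (my $record = $batch->next()) {
        my $xml = $record->as_xml_record();
        # Strip C0 control characters other than tab, LF, and CR.
        $xml =~ s/[\x00-\x08\x0B\x0C\x0E-\x1F]//g;
        $xml =~ s/\n//g;                  # one record per line
        print $xml, "\n";
    }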
18:47 |
jtaylorats |
Partly curious why the admin tool has no problem with the insert. Can't say for sure but I don't think it scrambled anything. |
18:47 |
jtaylorats |
I'll have to do some more checking. This is a test load and not worrying about it at the moment but need to cover that base before the next load. |
13:42 |
bshum |
Either way, probably good to poke at. |
13:44 |
pinesol_green |
[evergreen|Dan Scott] TPAC: Use indexed subfields only in author search links - <http://git.evergreen-ils.org/?p=Evergreen.git;a=commit;h=9f7b95c> |
13:44 |
pinesol_green |
[evergreen|Dan Scott] TPAC: Display authors using inline-block - <http://git.evergreen-ils.org/?p=Evergreen.git;a=commit;h=cf3b5e0> |
13:48 |
Dyrcona |
gmcharlt: Is MARC::File::XML 1.0.2 supposed to be available via CPAN, yet? (I think I may need to wait for my mirror to update.) |
13:49 |
gmcharlt |
Dyrcona: yes, it is, but definitely hasn't reached all the mirrors yet |
13:50 |
tsbere |
and the fact that our local mirror doesn't always update from the most up to date mirror, and even then only does so nightly, probably isn't helping |
13:50 |
bshum |
Do we need to restart anything to upgrade that? |
13:53 |
gmcharlt |
namely, because dependencies changed from 0.93 to 1.0 of MARC::File::XML, I don't know if it would hit wheezy |
13:54 |
csharp |
eww |
13:54 |
* csharp |
wonders if it will make it to 12.04 |
13:55 |
Dyrcona |
bshum: You should not have to restart anything after upgrading MARC::File::XML. |
13:56 |
jeff |
this makes it likely that the debian maintainer or debian security team would (at least attempt to) backport just the fix, leaving the version number the same. |
13:56 |
Dyrcona |
Only if you had the Perl modules preloaded in your PostgreSQL config, then you might have to restart PostgreSQL. |
14:01 |
gmcharlt |
there are other reasons to upgrade to 1.0.2, though -- speedier processing of MARCXML being the main one |
15:36 |
jeffdavis |
jeff: do your contracts with vendors using SIP include language about patron privacy/protection of personal info? |
15:36 |
csharp |
@quote search sip2 |
15:36 |
pinesol_green |
csharp: 1 found: #74: "< Dyrcona> SIP2 is not a suitable means for..." |
15:37 |
Dyrcona |
@quote search marc |
15:37 |
pinesol_green |
Dyrcona: 3 found: #38: "<jcamins> At least your MARC frameworks aren't...", #46: "<_bott_> Try restarting apache. not a...", and #52: "<dbs> MARC is not machine readable." |
15:37 |
csharp |
in our case all the SIP vendor agreements are between the individual libraries and the vendor |
15:37 |
jeff |
jeffdavis: currently we do not use SIP2 for vendor authentication. I'll know more about contracts if we ever do start using SIP2 with a vendor... which (*growl*, again) is likely soon. |
15:54 |
_bott_ |
I know I'll sleep better tonight |
15:54 |
rfrasur |
I'm glad MARC isn't sentient. |
15:55 |
* rfrasur |
would feel kinda bad. |
16:01 |
Dyrcona |
MARC is great for transmitting data on tape that will be printed onto card stock, but not much use for anything else.
16:01 |
dbs |
@quote search MARC |
16:01 |
pinesol_green |
dbs: 4 found: #38: "<jcamins> At least your MARC frameworks aren't...", #46: "<_bott_> I am not a cataloger, but I speak...", #52: "<dbs> MARC is not machine readable.", and #75: "< _bott_> I fear that MARC was a humorous..." |
16:03 |
rfrasur |
ooo, Dyrcona. That sounds like it'd be a great way to catalog our materials. I can see it now. The coolest wooden cabinets with cute little drawers that you can pull out and set right on a table. It'd be like browsing the shelves, but you could sit down. You could have different ones, too. Subject catalogs and title catalogs...and author. Just imagine the possibilities.
16:26 |
pinesol_green |
[evergreen|Dan Scott] Fedora needs Locale::Maketext::Lexicon too - <http://git.evergreen-ils.org/?p=Evergreen.git;a=commit;h=38fcad9> |
16:26 |
Dyrcona |
Right. I'll find a record with the 780 $t and 0 0 for indicators, then request it via feed. |
16:26 |
Dyrcona |
Thanks, guys. |
16:27 |
Dyrcona |
That is easier than what I was thinking of doing, which was find a record, copy its marc, and run that through xslt on the command line. |
16:27 |
dbs |
looking at MARC21MODS33 xsl suggests that it will be creating titleInfo/title elements for 780$t |
16:28 |
dbwells |
Dyrcona: my quick research suggests the MODS field is "<relatedItem type="preceding">" with child tags of <titleInfo><title>. |
16:28 |
Dyrcona |
dbwells++ |
16:29 |
Dyrcona |
My XSL perusal was heading that way, but I was confusing myself as to the order of the XML tags on output. |
16:29 |
dbs |
but only within the relatedItem... what dbwells said |
16:31 |
Dyrcona |
That does not appear to be part of any of the existing title indexes. |
16:32 |
Dyrcona |
The question is: would it be better to use the MODS or an XPath query on the MARC if I want to make a metabib_field for this?
16:43 |
dbwells |
Dyrcona: IMHO, if we are talking about straight extraction of a single subfield, it probably doesn't matter too much either way. The MODS XPATH would probably be a little more readable to anyone who sees your config, unless they happen to know what 780|t is. |
16:44 |
|
ktomita joined #evergreen |
16:44 |
|
fparks joined #evergreen |
16:44 |
|
sseng joined #evergreen |
16:45 |
Dyrcona |
dbwells: Thank you for your opinion. I think I'll go with MARC, since the MODS extract doesn't appear to take indicators into account. |
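A heavily hedged sketch of what that metabib_field might look like. The config.metabib_field column list is written from memory and must be checked against your Evergreen schema before use; the XPath keys on the 780 with both indicators 0, which is exactly what the stock MODS extract loses:

    use strict;
    use warnings;
    use DBI;

    my $dbh = DBI->connect('dbi:Pg:dbname=evergreen', 'evergreen', '',
                           { RaiseError => 1 });

    # Column list is an assumption; verify against config.metabib_field.
    $dbh->do(q{
        INSERT INTO config.metabib_field
            (field_class, name, label, format, xpath)
        VALUES
            ('title', 'preceding_entry', 'Preceding Title', 'marcxml',
             $$//marc:datafield[@tag='780' and @ind1='0' and @ind2='0']/marc:subfield[@code='t']$$)
    });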
16:50 |
|
smyers_ joined #evergreen |
16:51 |
|
jwoodard joined #evergreen |
17:05 |
|
mdriscoll1 left #evergreen |
17:13 |
|
mmorgan left #evergreen |
17:19 |
|
dcook joined #evergreen |
17:19 |
Bmagic |
I am having a heck of a time getting MARC records into a Perl program and then writing them back to disk without losing special characters
17:20 |
Dyrcona |
Bmagic: What is the source of the MARC records and what encoding are they in: UTF8 or MARC8? |
17:21 |
Bmagic |
Dyrcona: I would love to know how to tell but I think they are utf8 |
17:22 |
Dyrcona |
Bmagic: Look at the leader of one of the records; if position 09 is the letter 'a', then that record is UTF-8.
17:22 |
Bmagic |
they are blank |
17:22 |
Dyrcona |
If it is blank, it is supposed to be MARC-8. |
17:22 |
Bmagic |
wait, they are utf8 |
17:22 |
dbs |
That is a common problem with MARC records :/ |
17:23 |
Bmagic |
LDR 00751nam 22002057 4500 |
17:23 |
Bmagic |
The "a" in "nam" is what you mean? |
17:23 |
Dyrcona |
Well, that says it is MARC-8. |
17:23 |
Dyrcona |
No, that is the MARC type. |
17:24 |
Bmagic |
ok then, they are MARC-8. I use this code:
17:24 |
Dyrcona |
The a would be after the m and before the 2. |
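The leader check as a quick script; position 09 is zero-based, so in Bmagic's leader above it sits between the "nam" and the "2200":

    use strict;
    use warnings;
    use MARC::Batch;

    my $batch = MARC::Batch->new('USMARC', $ARGV[0]);
    while (my $record = $batch->next()) {
        # Leader position 09 (zero-based): 'a' means UTF-8, blank means MARC-8.
        my $enc = substr($record->leader(), 9, 1) eq 'a' ? 'UTF-8' : 'MARC-8';
        print +($record->title() // '(no title)'), ": $enc\n";
    }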
17:24 |
Bmagic |
my $file = MARC::File::USMARC->in( $inputFile ); |
17:26 |
Bmagic |
Diol{acute}e becomes Diolâe |
17:26 |
|
dconnor joined #evergreen |
17:27 |
Bmagic |
Francisco V{esc}b4{esc}sasquez becomes Francisco Vb4sasquez |
17:27 |
Dyrcona |
That might be the binmode, or your records are not really MARC-8. |
17:27 |
Dyrcona |
As dbs said: "A common problem with MARC records." |
17:27 |
Bmagic |
so, what other options can I experiment with on the binmode output?
17:28 |
Dyrcona |
Bmagic: I suggest reading this: http://search.cpan.org/~gmcharlt/MARC-Charset-1.35/lib/MARC/Charset.pm |
17:28 |
Dyrcona |
Try the examples in that to convert to UTF-8. |
17:28 |
Bmagic |
Dyrcona: oh yeah, that |
17:28 |
Dyrcona |
I'd also output to MARCXML so the output is human readable, just to check that the output makes sense. |
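A sketch of the conversion being recommended: walk every subfield through MARC::Charset's marc8_to_utf8() and emit MARCXML for eyeballing. Fixed-field bookkeeping beyond leader/09 is omitted for brevity:

    use strict;
    use warnings;
    use MARC::Batch;
    use MARC::Field;
    use MARC::Charset qw(marc8_to_utf8);
    use MARC::File::XML (BinaryEncoding => 'UTF-8');

    my $batch = MARC::Batch->new('USMARC', $ARGV[0]);
    binmode(STDOUT, ':utf8');

    while (my $record = $batch->next()) {
        for my $field ($record->fields()) {
            next if $field->is_control_field();
            my @subs = map { ($_->[0], marc8_to_utf8($_->[1])) }
                       $field->subfields();
            $field->replace_with(MARC::Field->new(
                $field->tag(), $field->indicator(1),
                $field->indicator(2), @subs));
        }
        my $leader = $record->leader();
        substr($leader, 9, 1) = 'a';      # flag the record as UTF-8
        $record->leader($leader);
        print $record->as_xml_record();   # human-readable sanity check
    }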
11:56 |
|
frank_____ joined #evergreen |
11:58 |
|
RoganH joined #evergreen |
12:03 |
frank_____ |
hi all, I have a question: how can I insert HTML code directly into a MARC record template? I added it, but it didn't work. I want to add an <a href> tag
12:04 |
Dyrcona |
frank_____: You can't, or at least you shouldn't. MARC is MARC and HTML is HTML. HTML has no meaning in MARC. |
12:04 |
Dyrcona |
frank_____: What is your ultimate goal? There is very likely a way to do it. |
12:05 |
rfrasur |
Dyrcona: you beat me to it. |
12:06 |
|
kollex left #evergreen |
12:06 |
frank_____ |
hi Dyrcona, thanks for the response. The circulation people want to add a URL in a field that isn't 856, so I guess the only way to do that is to insert HTML code when they are cataloging
12:07 |
Dyrcona |
frank_____: It won't work, and should never work. URLs go in 856, so sayeth the LoC. |
12:08 |
Dyrcona |
You could rig up a 9xx tag with a subfield that contains a URL and the link text and then make that display by editing your TPAC templates, etc. |
12:08 |
Dyrcona |
However, you should never put html directly into a MARC record. You are likely to make the parser very unhappy. |
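A sketch of that 9xx workaround; the 956 tag and its $u/$y subfields are purely hypothetical local conventions:

    use strict;
    use warnings;
    use MARC::Record;
    use MARC::Field;

    my $record = MARC::Record->new();
    # URL and display text live in a local 9xx tag, not raw HTML.
    $record->append_fields(MARC::Field->new(
        '956', ' ', ' ',
        u => 'http://example.org/resource',
        y => 'Text the OPAC should display',
    ));
    # A TPAC template tweak would then read 956$u/$y and build the link.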
12:10 |
frank_____ |
ok, I will try to explain it to the cat staff. I don't know why they are trying to do that
12:11 |
Dyrcona |
frank_____: They're librarians.... ;) |
12:11 |
frank_____ |
yes, you are right, thanks for your help, regards |
10:29 |
paxed |
the forward slash is due to cataloguing rules, not part of the title itself. |
10:30 |
kmlussier |
Yes, I can see a distinction between the forward slash and the other punctuation examples. Those punctuation marks are really part of the title. The forward slash is not.
10:30 |
kmlussier |
Though, I've found we have many records with a period that doesn't belong there. |
10:31 |
Dyrcona |
I'm not even sure it's the rules so much as cruft left over from earlier style with MARC records.
10:31 |
* Dyrcona |
throws his hands up and mumbles something about MARC being synonymous with garbage. |
10:32 |
dbs |
I'm sure patrons would be just as upset if "^Decade$" did not return "Decade." because "obviously" the period should not be significant
10:32 |
Dyrcona |
What's obvious to one person isn't to someone else, unfortunately.... |
10:32 |
dbs |
Dyrcona: exactly |
10:33 |
bshum |
Hmm |
10:33 |
* bshum |
goes fishing |
10:33 |
dbs |
So, we need an exact search that runs against a more normalized version of metabib.full_record.value that strips out ... something ... |
10:33 |
Dyrcona |
pinesol_green must have tried to ingest some MARC. |
10:34 |
bshum |
Weird... |
10:34 |
dbs |
Don't we have one page per release with release dates & end-of-life and stuff like that? |
10:35 |
dbs |
I guess not. |
10:37 |
Dyrcona |
But, hey, my brain stopped working months ago. |
10:39 |
dbs |
paxed: sure. I guess my concern with munging the database is that a change there means reingesting records, whereas if a site doesn't like the TPAC display, they just have to tweak misc_util.tt2 |
10:41 |
paxed |
kmlussier: re. periods where they don't belong, i guess it would be possible to weed out those titles that end in a period and do not have (a subfield that belongs to the title) following the title subfield |
10:42 |
Dyrcona |
Can we just burn MARC already, and AACR2, and RDA, and ....? |
10:43 |
paxed |
dbs: that MARC is stupid and has stupid workarounds that put extra characters where they are not part of the actual text is still something patrons shouldn't need to know or care about. |
10:46 |
dbs |
paxed: You say that like I'm arguing that patrons should be required to take a MARC class before touching the catalogue or something. |
10:47 |
collum |
I'm not a cataloger either, but isn't there typically a space before that slash? And isn't there other punctuation used in MARC as well?
08:46 |
bshum |
Dyrcona++ |
08:47 |
bshum |
I suspect this will be useful for our quarterly exports for autographics too (state ILL stuff) |
08:47 |
Dyrcona |
I think I'll try another dump as XML to see if the size limit applies there, too. |
08:47 |
Dyrcona |
EBSCO wants binary MARC, so I guess I'll need to tell them that our four most popular bibs, i.e. the ones with the most copies, don't export in that format.
08:47 |
|
mrpeters joined #evergreen |
08:50 |
Dyrcona |
Hey! Fun. US Gov't shutdown affects syrup users, and anyone looking up MARC information online: http://www.loc.gov/home2/shutdown-message.html |
08:50 |
Dyrcona |
Syrup apparently looks stuff up on the LOC website. |
08:51 |
Dyrcona |
We don't use Syrup, I just heard about it from Syrup users on a mailing list. |
08:53 |
Dyrcona |
Bet their Z39.50 is offline, too, and I bet we get a Launchpad bug or an internal RT ticket about it.
11:57 |
|
smyers_ joined #evergreen |
12:03 |
Dyrcona |
bshum++ # for testing before I get to ti. |
12:03 |
Dyrcona |
it |
12:07 |
Dyrcona |
bshum: We might be able to trap the MARC::File error with an eval and print the bib id. |
12:08 |
* Dyrcona |
was away from his desk for the past hour or more, chatting with a coworker about Evergreen issues and our new member library's first day. |
12:09 |
rfrasur |
Dyrcona: stupid question that I should know the answer to, so apologies. Are you migrating a library? Or were you? How's it going? I know you have new ones...just not sure HOW new.
12:09 |
Dyrcona |
rfrasur: Yes, we added the Groton Public Library http://www.gpl.org/ this past weekend. Yesterday was their first day live on our Evergreen system.