Discussion:
sendbackup: index tee cannot write [Broken pipe], Why?
John Grover
2003-11-03 15:21:50 UTC
Amanda Users List:

I've been getting the following error consistently on this client/disk
despite my best efforts to solve it. I've changed the timeout values on
the server, removed extra files from the file system and double checked
permissions and port numbers. The /var filesystem on this same client
backs up fine using the same configuration.

I'm guessing that the timeout is caused by the failure of tar to write
to the broken pipe, but the logs don't give me much to go on. I've
scoured every last resource I can find and still get these results. I'd
be happy to send logs to anyone who would care to help me out with
this.

Thanks,
John Grover



/-- MyClient.my.d / lev 0 FAILED [data timeout]
sendbackup: start [MyClient.my.domain.name.here:/ level 0]
sendbackup: info BACKUP=/usr/bin/tar
sendbackup: info RECOVER_CMD=/usr/bin/tar -f... -
sendbackup: info end
? sendbackup: index tee cannot write [Broken pipe]
? index returned 1
sendbackup: error [/usr/bin/tar got signal 13]
\--------

Log excerpt:
...
WARNING planner Last full dump of MyClient.my.domain.name.here:/ on tape overwritten in 1 run.
...
FAIL dumper MyClient.my.domain.name.here / 20031101 0 [data timeout]
sendbackup: start [MyClient.my.domain.name.here:/ level 0]
sendbackup: info BACKUP=/usr/bin/tar
sendbackup: info RECOVER_CMD=/usr/bin/tar -f... -
sendbackup: info end
? gtar: Cannot add file ./usr/local/blackboard/docs/lib/session/41394: No such file or directory
? sendbackup: index tee cannot write [Broken pipe]
? index returned 1
sendbackup: error [/usr/bin/tar got signal 13]
SUCCESS dumper MyClient.my.domain.name.here /var 20031101 1 [sec 14.720 kb 2500 kps 169.8 orig-kb 2500]
SUCCESS taper MyClient.my.domain.name.here /var 20031101 1 [sec 2.310 kb 2500 kps 1081.8 {wr: writers 80 rdwait 0.000 wrwait 0.199 filemark 2.107}]
...
Paul Bijnens
2003-11-03 16:17:26 UTC
Post by John Grover
> I'm guessing that the timeout is caused by the failure of tar to write
> to the broken pipe, but the logs don't give me much to go on. I've
> scoured every last resource I can find and still get these results. I'd
> be happy to send logs to anyone who would care to help me out with
> this.
>
> ? sendbackup: index tee cannot write [Broken pipe]
> ? index returned 1
> sendbackup: error [/usr/bin/tar got signal 13]

This problem is reported now and then on this list. People give lots of
tips to help investigate or fix it, but nobody has ever come back to say
what he/she did to solve it, if it ever got solved at all.

It seems it is the index pipe on the server that is failing with
an (unknown) error. This triggers a broken index pipe on the client.
We're not sure about that either. That's why I suggested
(temporarily) running without an index for this DLE, to see if it gets
any better.
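
The two lines in the report look like the same event seen from two sides: a write to a pipe whose reader has gone away fails with EPIPE ("Broken pipe"), and a process that has not ignored SIGPIPE is killed by it (SIGPIPE is signal 13 on Linux and most Unix systems). The following is only a minimal Python sketch of that mechanism, not Amanda code; the pipe merely stands in for the index stream, and the function name is made up for illustration.

```python
import errno
import os
import signal


def write_after_reader_exit() -> int:
    """Write to a pipe whose read end is already closed; return the errno seen."""
    read_fd, write_fd = os.pipe()
    os.close(read_fd)  # the reader (a stand-in for the index consumer) goes away
    try:
        os.write(write_fd, b"./some/file\n")
    except BrokenPipeError as exc:
        # Python ignores SIGPIPE by default, so the failure surfaces as EPIPE
        # instead of killing the process the way it killed tar.
        return exc.errno
    finally:
        os.close(write_fd)
    return 0


if __name__ == "__main__":
    print("errno name:", errno.errorcode[write_after_reader_exit()])
    print("SIGPIPE number:", int(signal.SIGPIPE))
```

A program such as tar that leaves SIGPIPE at its default action never sees the EPIPE error; it is simply terminated, which would match the "got signal 13" line.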

If this works fine, then you could try to create the index manually
and see what's wrong, using commands on the server:
    amrestore -p ... | gtar -tf - > /tmp/the_index
Then with "gtar" running on the client:
    ssh -l amanda amanda_server amrestore -p ... | gtar -tf - > /tmp/x

If you gzip the resulting index file and put it in the correct place, with
the correct name, you have an index for amrecover too.
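
The gzip step could be sketched in Python as below. This is an illustration under assumptions: the function name and both paths are hypothetical, and the thread does not say where Amanda expects the index file or what it must be called, so the destination is left to the caller. Compression level 9 corresponds to "gzip --best".

```python
import gzip
import shutil
from pathlib import Path


def compress_index(listing: Path, destination: Path) -> Path:
    """Gzip a plain-text file listing at the highest compression level."""
    with listing.open("rb") as src, gzip.open(destination, "wb", compresslevel=9) as dst:
        shutil.copyfileobj(src, dst)  # stream, so a large listing is not held in memory
    return destination


if __name__ == "__main__":
    # Hypothetical paths: the listing produced by "gtar -tf -" above, and a
    # destination that the caller must choose to match the server's index layout.
    compress_index(Path("/tmp/the_index"), Path("/tmp/the_index.gz"))
```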


Also verify the gnutar version (on the client!); best is 1.13.25
(1.13.19 is probably ok too, but if you have it, I advise upgrading
anyway). All other versions are suspect.
Maybe the problem is with gzip (index files are ALWAYS compressed
with --best). Verify that version too.
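
A small sketch of the version check, assuming the first line of "gtar --version" has the usual GNU form "tar (GNU tar) X.Y.Z". The function names and the three labels are invented for illustration; the labels only restate the advice given in this message and are not an official compatibility list.

```python
import re
from typing import Optional, Tuple

RECOMMENDED = (1, 13, 25)  # "best" according to the advice above
PROBABLY_OK = (1, 13, 19)  # "probably ok too"


def parse_gnutar_version(version_output: str) -> Optional[Tuple[int, ...]]:
    """Extract the version tuple from the first line of `gtar --version`."""
    lines = version_output.splitlines()
    first_line = lines[0] if lines else ""
    match = re.search(r"\(GNU tar\)\s+(\d+(?:\.\d+)*)", first_line)
    if match is None:
        return None  # not GNU tar, or an unexpected banner
    return tuple(int(part) for part in match.group(1).split("."))


def classify(version: Optional[Tuple[int, ...]]) -> str:
    """Label a version according to the advice in this message."""
    if version == RECOMMENDED:
        return "recommended"
    if version == PROBABLY_OK:
        return "probably ok"
    return "suspect"


if __name__ == "__main__":
    import subprocess

    # Assumes a "gtar" binary on PATH; on many systems it is simply "tar".
    output = subprocess.run(["gtar", "--version"], capture_output=True, text=True).stdout
    print(classify(parse_gnutar_version(output)))
```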
--
Paul Bijnens, Xplanation Tel +32 16 397.511
Technologielaan 21 bus 2, B-3001 Leuven, BELGIUM Fax +32 16 397.512
http://www.xplanation.com/ email: ***@xplanation.com
***********************************************************************
* I think I've got the hang of it now: exit, ^D, ^C, ^\, ^Z, ^Q, F6, *
* quit, ZZ, :q, :q!, M-Z, ^X^C, logoff, logout, close, bye, /bye, *
* stop, end, F3, ~., ^]c, +++ ATH, disconnect, halt, abort, hangup, *
* PF4, F20, ^X^X, :D::D, KJOB, F14-f-e, F8-e, kill -1 $$, shutdown, *
* kill -9 1, Alt-F4, Ctrl-Alt-Del, AltGr-NumLock, Stop-A, ... *
* ... "Are you sure?" ... YES ... Phew ... I'm out *
***********************************************************************
Martinez, Michael
2003-11-05 14:26:34 UTC
I had this problem and fixed it by specifying "holding-disk -1 local" in
disklist for the partition holding ~amanda on the tape server, and
specifying "-1 local" for the rest of the tape server partitions.

Regards,

Michael Martinez
ISTM/CSREES
United States Department of Agriculture
---
This email is signed with my digital signature so that you may verify
the authenticity of the sender.

Paul Bijnens
2003-11-05 15:18:50 UTC
Post by Martinez, Michael
> I had this problem and fixed it by specifying "holding-disk -1 local" in
> disklist for the partition holding ~amanda on the tape server, and
> specifying "-1 local" for the rest of the tape server partitions.
>
> > ? sendbackup: index tee cannot write [Broken pipe]
> > ? index returned 1
> > sendbackup: error [/usr/bin/tar got signal 13]

Clarifying -- I guess you had DLEs like:

host.domain /amanda holding-disk -1 local
host.domain / comp-user-tar -1 local
host.domain /home comp-user-tar -1 local

And with "comp-user-tar" instead of "holding-disk" on /amanda you
would get the above error message.

"Holding-disk" results in backup to tape immediately, bypassing
the holding disk. Any idea how that could influence the index tee?
Did ~amanda also contain the holding disk? If yes, then it's
obvious you had to specify it, otherwise tar could create a huge
(infinite) backup image. Eventually you would run out of holding disk
space.

Could it be triggered by the following sequence?
Tar is reading some huge files, maybe compressing them too. This
takes a long time; in the meantime the "index" tee times out
on the server because the accompanying index is only a few lines and
is still buffered in the client. But AFAIK there is no timeout
on the index tee, see dumper.c, line 1260 etc.
Or you should find some errors there about "dup2", just before the
execlp of "gzip --best" for the index writer. Or the "gzip --best"
on the server crashed (how would you find out about this? There is
no shell to log such a message). It could also be an output
error in "gzip --best" because your disk got full. But then
you should have found a message in the logs (or maybe not, because
your disk was full :-) ).
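
The disk-full possibility is the easiest of these to rule out. A minimal sketch, assuming you know which directory on the server receives the index files; the path used below is a placeholder and the function names are made up for illustration.

```python
import shutil


def free_megabytes(path: str) -> float:
    """Return the free space, in MiB, on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free / (1024 * 1024)


def has_room(path: str, required_mb: float) -> bool:
    """True if at least `required_mb` MiB are free on the filesystem of `path`."""
    return free_megabytes(path) >= required_mb


if __name__ == "__main__":
    # "." is a placeholder; point it at the server's index directory.
    print(f"{free_megabytes('.'):.1f} MiB free")
```

Running this from cron shortly before the backup window would show whether the filesystem was already close to full when the index writer started.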

Just trying to understand -- maybe we found some obscure bug.
Martinez, Michael
2003-11-05 19:25:16 UTC
Wish I knew why it fixed it. In fact, when I added the holding-disk
stuff to disklist, it wasn't because I was trying to fix the index tee
problem per se; it was simply because I noticed I had forgotten it.

Then I subsequently noticed I had no more "broken tee" errors ...

Regards,

Michael Martinez
ISTM/CSREES
United States Department of Agriculture
