Hi all,

Thanks to Morey and Mike for their advice. DRBD is now syncing correctly without any stalling. Upgrading the firmware and the matching driver from HP worked a treat.

Thanks
Jim

-----Original Message-----
From: drbd-user-bounces at lists.linbit.com [mailto:drbd-user-bounces at lists.linbit.com] On Behalf Of drbd-user-request at lists.linbit.com
Sent: 25 November 2009 21:44
To: drbd-user at lists.linbit.com
Subject: drbd-user Digest, Vol 64, Issue 42

Send drbd-user mailing list submissions to
	drbd-user at lists.linbit.com

To subscribe or unsubscribe via the World Wide Web, visit
	http://lists.linbit.com/mailman/listinfo/drbd-user
or, via email, send a message with subject or body 'help' to
	drbd-user-request at lists.linbit.com

You can reach the person managing the list at
	drbd-user-owner at lists.linbit.com

When replying, please edit your Subject line so it is more specific
than "Re: Contents of drbd-user digest..."

Today's Topics:

   1. Re: 8.3.5 Stalling on sync (Roof, Morey R.)
   2. Re: 8.3.5 Stalling on sync (Mike Lovell) (David.Livingstone at cn.ca)

----------------------------------------------------------------------

Message: 1
Date: Wed, 25 Nov 2009 14:03:14 -0700
From: "Roof, Morey R." <MRoof at admin.nmt.edu>
Subject: Re: [DRBD-user] 8.3.5 Stalling on sync
To: <drbd-user at lists.linbit.com>
Message-ID: <C99FEB4E3BA7A84A854CE66AE19CCDE8135E2FE4 at admin.NMTADM.AD>
Content-Type: text/plain; charset="iso-8859-1"

Actually I have those exact cards and I'm not seeing your problem, but getting those cards to work was a major pain in the rear end. I much prefer the Myricom cards, but for this HP server pair I got stuck using the HP cards due to a political issue. Anyways, some of the things I found out about these cards might be of help to you. We use SuSE here but doing the same for RedHat shouldn't be much of a problem.

The biggest issue is that these cards get very hot and can overheat easily if they don't have a good amount of airflow.
Once they begin to overheat, packets disappear and things fall apart. Since you are seeing stalls after a bit of a run I would think that you might be having an overheating issue.

Also, the driver that comes with the Linux kernel doesn't work very well so you need to get the HP driver and install it. HOWEVER, you absolutely must use the driver version that matches the firmware version. If they are different, things don't work and you can't even run the diagnostic tool. Here I'm running firmware 4.0.516 and driver 4.0.516.

When I was trying to get these working I would set up long runs of netperf and iperf to see how hot I could get the cards, and then run the diagnostic tool, as it will tell you the temperature of the card. I have found they start to freak out at about 85C. After playing around with card position they run under load at 66C and seem to work fine with 27C ambient air temp.

All in all I'm not very impressed with these cards but I got stuck using them in one place.

Hope the information helps a bit,

Morey

________________________________

From: drbd-user-bounces at lists.linbit.com [mailto:drbd-user-bounces at lists.linbit.com] On Behalf Of Mike Lovell
Sent: Wednesday, November 25, 2009 11:45 AM
To: James Larcombe
Cc: drbd-user at lists.linbit.com
Subject: Re: [DRBD-user] 8.3.5 Stalling on sync

hrm. i thought i had heard of someone using drbd over 10 gig with netxen cards. i went looking for a few minutes and didn't find anything though. my recommendation would be to try newer drivers, either through compiling the drivers for your existing kernel or using a newer kernel. i don't have details on how to do that for your cards cause i have never used any 10 gig from hp or netxen. other than that, my only recommendation is new nics.

good luck
mike

James Larcombe wrote:

Hi Mike,

The cards I'm using are HP NC522SFP Dual Port 10GbE Server Adapters with HP BLc 10Gb SR SFP+ Fiber Transceivers. I could try running these with 1GB Fiber cables instead of 10GB.
James

From: Mike Lovell [mailto:mike at dev-zero.net]
Sent: 25 November 2009 16:01
To: James Larcombe
Cc: drbd-user at lists.linbit.com
Subject: Re: [DRBD-user] 8.3.5 Stalling on sync

nothing i tried tweaking in drbd.conf worked. the only thing that did was changing the 10gig interfaces. what cards are you using? i was using ones with an intel chip. the cards that i did get it to work with were from chelsio. in my previous thread on the list, someone mentioned that they had neterion cards working.

mike

James Larcombe wrote:

Hi Mike,

Thanks for the quick response. Yes you are correct we are using 10gig fibre cards. I'm not sure we could change them though as the fibre modules used in them cost over £400 each.

Is there anything I can tweak in the drbd.conf file to get these to work?

James

From: Mike Lovell [mailto:mike at dev-zero.net]
Sent: 24 November 2009 17:49
To: James Larcombe
Cc: drbd-user at lists.linbit.com
Subject: Re: [DRBD-user] 8.3.5 Stalling on sync

James Larcombe wrote:

Hi List,

Please help. I have installed drbd 8.3.5 on Open Suse 11.1 (Kernel 2.6.27.29-0.1).

I have run drbdadm create-md dbms-test on one node and create-md dbms-test2 on the other node. I then ran drbdadm up all on both nodes. I then ran drbdadm -- --overwrite-data-of-my-peer primary dbms-test on the first node and the same with dbms-test2 on the other node. They then run for a short while before stalling. I have tried older version without success and turning the sync rate down does not make any difference. Downing the resources and bringing back up starts the sync again but this then stalls quickly.

I have attached /proc/drbd, /etc/drbd.conf and a section from /var/log/messages. Any pointers would be greatly appreciated.
version: 8.3.5 (api:88/proto:86-91)
GIT-hash: ded8cdf09b0efa1460e8ce7a72327c60ff2210fb build by root at hp-tm-40, 2009-11-24 12:21:46
 0: cs:SyncSource ro:Secondary/Secondary ds:UpToDate/Inconsistent C r----
    ns:160896 nr:0 dw:0 dr:160896 al:0 bm:9 lo:1 pe:0 ua:0 ap:0 ep:1 wo:b oos:926694296
        [>.] sync'ed: 0.1% (905040/905132)M 4972
        stalled
 1: cs:SyncTarget ro:Secondary/Secondary ds:Inconsistent/UpToDate C r----
    ns:0 nr:2173248 dw:2173248 dr:0 al:0 bm:132 lo:0 pe:29878 ua:0 ap:0 ep:1 wo:b oos:777971256
        [>.] sync'ed: 0.3% (759736/761856)M
        Stalled

what kind of network are you using between the two servers? this is almost the exact same behavior i had when i was trying to get drbd to work over 10gig ethernet. turned out to be something in drbd didn't like something about the 10gig cards i had. i eventually had to change my network cards. what cards are you using? 1gig? 10gig? have you tried other cards? that is where i would look.

mike

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.linbit.com/pipermail/drbd-user/attachments/20091125/1da3abd7/attachment-0001.htm>

------------------------------

Message: 2
Date: Wed, 25 Nov 2009 14:27:29 -0700
From: David.Livingstone at cn.ca
Subject: Re: [DRBD-user] 8.3.5 Stalling on sync (Mike Lovell)
To: drbd-user at lists.linbit.com
Message-ID: <OF45AFA1F2.C44E4250-ON87257679.007476DA-87257679.0075DEFE at cn.ca>
Content-Type: text/plain; charset="us-ascii"

I've been using HP NC510C PCIe 10 gigabit nic (netxen) since early 2009 in a drbd setup between two DL380G5. We did experience hanging issues with the card but this was related to driver versions (HP psp support packs). We ended up opening a case with HP and are currently running on an "older" version of the nx_nic driver. If you want I will send you the specifics offline.

BTW I just purchased some DL380G6 with NC522SFP (with BLc copper) and will be setting them up in the New Year.
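
For anyone who wants to watch for this kind of stall from a script rather than by eye, here is a minimal Python sketch. It assumes the 8.3.x /proc/drbd layout shown in James's post (a "N: cs:..." header line per device, with the word "stalled" somewhere in that device's block). The function name find_stalled and the line layout of the sample text are my own assumptions for illustration; nothing here is part of DRBD itself.

```python
import re

# A device block starts with a line like " 0: cs:SyncSource ro:...".
DEVICE_RE = re.compile(r"^\s*(\d+):\s+cs:(\S+)")


def find_stalled(proc_drbd_text):
    """Return {minor: connection_state} for devices whose block says 'stalled'."""
    stalled = {}
    current = None  # (minor, state) of the device block currently being read
    for line in proc_drbd_text.splitlines():
        match = DEVICE_RE.match(line)
        if match:
            current = (int(match.group(1)), match.group(2))
        elif current and "stalled" in line.lower():
            stalled[current[0]] = current[1]
    return stalled


# Sample transcribed from the post above; the exact line breaks are assumed.
SAMPLE = """\
version: 8.3.5 (api:88/proto:86-91)
 0: cs:SyncSource ro:Secondary/Secondary ds:UpToDate/Inconsistent C r----
    ns:160896 nr:0 dw:0 dr:160896 al:0 bm:9 lo:1 pe:0 ua:0 ap:0 ep:1 wo:b oos:926694296
        [>.] sync'ed: 0.1% (905040/905132)M 4972
        stalled
 1: cs:SyncTarget ro:Secondary/Secondary ds:Inconsistent/UpToDate C r----
    ns:0 nr:2173248 dw:2173248 dr:0 al:0 bm:132 lo:0 pe:29878 ua:0 ap:0 ep:1 wo:b oos:777971256
        [>.] sync'ed: 0.3% (759736/761856)M
        Stalled
"""

if __name__ == "__main__":
    print(find_stalled(SAMPLE))
```

On a live node the same function could be fed the contents of /proc/drbd, e.g. find_stalled(open("/proc/drbd").read()). It only reports that a sync has stalled; as the thread shows, the cause here turned out to be the NIC driver and firmware, not anything in drbd.conf.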