Channel: Intel Communities : Unanswered Discussions - Servers

CentOS 6.6 Intel SRCSAS18E


I have been running an Intel motherboard with hardware RAID and 6 x 1TB drives for about four years. The server stopped booting a few weeks ago, and I am trying to resurrect it and, if possible, salvage the data. The controller was not even showing up during POST, so I moved it to another board I had around. It shows up there, but it is still not working correctly. The original system was running CentOS 5.x; the new system the controller was moved to runs CentOS 6.6.

 

Issue: the SRCSAS18E RAID 5 volume is not showing up. I cannot use "ctrl+G" during POST (it pauses for about 45 seconds during POST). The OS driver is installed, as well as the CLI and web tool, but the controller does not show up in the web tool. When I issue the command to re-flash the firmware (likely an upgrade, as I don't recall the last time I flashed it), the response is not very useful.

 

 

 

Hardware:

[root@titan1 ~]# lspci

00:00.0 Host bridge: NVIDIA Corporation C55 Host Bridge (rev a2)

00:00.1 RAM memory: NVIDIA Corporation C55 Memory Controller (rev a1)

00:00.2 RAM memory: NVIDIA Corporation C55 Memory Controller (rev a1)

00:00.3 RAM memory: NVIDIA Corporation C55 Memory Controller (rev a1)

00:00.4 RAM memory: NVIDIA Corporation C55 Memory Controller (rev a1)

00:00.5 RAM memory: NVIDIA Corporation C55 Memory Controller (rev a2)

00:00.6 RAM memory: NVIDIA Corporation C55 Memory Controller (rev a1)

00:00.7 RAM memory: NVIDIA Corporation C55 Memory Controller (rev a1)

00:01.0 RAM memory: NVIDIA Corporation C55 Memory Controller (rev a1)

00:01.1 RAM memory: NVIDIA Corporation C55 Memory Controller (rev a1)

00:01.2 RAM memory: NVIDIA Corporation C55 Memory Controller (rev a1)

00:01.3 RAM memory: NVIDIA Corporation C55 Memory Controller (rev a1)

00:01.4 RAM memory: NVIDIA Corporation C55 Memory Controller (rev a1)

00:01.5 RAM memory: NVIDIA Corporation C55 Memory Controller (rev a1)

00:01.6 RAM memory: NVIDIA Corporation C55 Memory Controller (rev a1)

00:02.0 RAM memory: NVIDIA Corporation C55 Memory Controller (rev a1)

00:02.1 RAM memory: NVIDIA Corporation C55 Memory Controller (rev a1)

00:02.2 RAM memory: NVIDIA Corporation C55 Memory Controller (rev a1)

00:03.0 PCI bridge: NVIDIA Corporation C55 PCI Express bridge (rev a1)

00:06.0 PCI bridge: NVIDIA Corporation C55 PCI Express bridge (rev a1)

00:07.0 PCI bridge: NVIDIA Corporation C55 PCI Express bridge (rev a1)

00:09.0 RAM memory: NVIDIA Corporation MCP51 Host Bridge (rev a2)

00:0a.0 ISA bridge: NVIDIA Corporation MCP51 LPC Bridge (rev a3)

00:0a.1 SMBus: NVIDIA Corporation MCP51 SMBus (rev a3)

00:0a.2 RAM memory: NVIDIA Corporation MCP51 Memory Controller 0 (rev a3)

00:0b.0 USB controller: NVIDIA Corporation MCP51 USB Controller (rev a3)

00:0b.1 USB controller: NVIDIA Corporation MCP51 USB Controller (rev a3)

00:0d.0 IDE interface: NVIDIA Corporation MCP51 IDE (rev a1)

00:0e.0 RAID bus controller: NVIDIA Corporation MCP51 Serial ATA Controller (rev a1)

00:0f.0 RAID bus controller: NVIDIA Corporation MCP51 Serial ATA Controller (rev a1)

00:10.0 PCI bridge: NVIDIA Corporation MCP51 PCI Bridge (rev a2)

00:10.1 Audio device: NVIDIA Corporation MCP51 High Definition Audio (rev a2)

00:14.0 Bridge: NVIDIA Corporation MCP51 Ethernet Controller (rev a3)

01:00.0 PCI bridge: Intel Corporation 80333 Segment-A PCI Express-to-PCI Express Bridge

01:00.2 PCI bridge: Intel Corporation 80333 Segment-B PCI Express-to-PCI Express Bridge

02:0e.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS 1068

06:08.0 VGA compatible controller: Matrox Electronics Systems Ltd. MGA 1064SG [Mystique] (rev 02)

06:0a.0 Ethernet controller: Intel Corporation 82544EI Gigabit Ethernet Controller (Copper) (rev 02)

[root@titan1 ~]# lspci -vv

<snip>

02:0e.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS 1068

        Subsystem: Intel Corporation RAID Controller SRCSAS18E

        Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping+ SERR- FastB2B- DisINTx-

        Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-

        Interrupt: pin A routed to IRQ 16

        Region 0: Memory at cfef0000 (32-bit, prefetchable) [size=64K]

        Region 2: Memory at cfdc0000 (32-bit, non-prefetchable) [size=128K]

        [virtual] Expansion ROM at cfe00000 [disabled] [size=32K]

        Capabilities: [c0] Power Management version 2

                Flags: PMEClk- DSI- D1+ D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)

                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-

        Capabilities: [d0] MSI: Enable- Count=1/2 Maskable- 64bit+

                Address: 0000000000000000  Data: 0000

        Capabilities: [e0] PCI-X non-bridge device

                Command: DPERE- ERO- RBC=512 OST=4

                Status: Dev=02:0e.0 64bit+ 133MHz+ SCD- USC- DC=bridge DMMRBC=1024 DMOST=4 DMCRS=16 RSCEM- 266MHz- 533MHz-

        Kernel modules: megaraid_sas

 

[root@titan1 CmdTool2]# dmesg |less

 

 

<snip>

scsi 2:0:0:0: Direct-Access ATA  Hitachi HTS72322 FCDO PQ: 0 ANSI: 5

ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)

ata4.00: ATA-8: WDC WD2500BEKT-60PVMT0, 01.01A01, max UDMA/133

ata4.00: 488397168 sectors, multi 16: LBA48 NCQ (depth 31/32)

ata4.00: configured for UDMA/133

scsi 3:0:0:0: Direct-Access ATA  WDC WD2500BEKT-6 01.0 PQ: 0 ANSI: 5

ata6: SATA link down (SStatus 0 SControl 300)

megasas: 06.803.01.00-rh1 Mon. Mar. 10 17:00:00 PDT 2014

megasas: 0x1000:0x0411:0x8086:0x1001: bus 2:slot 14:func 0

megaraid_sas 0000:02:0e.0: enabling device (0080 -> 0082)

ACPI: PCI Interrupt Link [AXV7] enabled at IRQ 16

  alloc irq_desc for 16 on node -1

  alloc kstat_irqs on node -1

megaraid_sas 0000:02:0e.0: PCI INT A -> Link[AXV7] -> GSI 16 (level, low) -> IRQ 16

megasas: Waiting for FW to come to ready state

sr0: scsi3-mmc drive: 40x/40x writer cd/rw xa/form2 cdda tray

Uniform CD-ROM driver Revision: 3.20

sr 0:0:1:0: Attached scsi CD-ROM sr0

STARTING CRC_T10DIF

sd 2:0:0:0: [sda] 488397168 512-byte logical blocks: (250 GB/232 GiB)

sd 3:0:0:0: [sdb] 488397168 512-byte logical blocks: (250 GB/232 GiB)

sd 2:0:0:0: [sda] Write Protect is off

sd 2:0:0:0: [sda] Mode Sense: 00 3a 00 00

sd 2:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA

sd 3:0:0:0: [sdb] Write Protect is off

sd 3:0:0:0: [sdb] Mode Sense: 00 3a 00 00

sd 3:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA

sda:

sdb: sdb1 sdb2

sd 3:0:0:0: [sdb] Attached SCSI disk

sda1 sda2

sd 2:0:0:0: [sda] Attached SCSI disk

INFO: task modprobe:351 blocked for more than 120 seconds.

      Not tainted 2.6.32-504.3.3.el6.x86_64 #1

"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

modprobe      D 0000000000000000     0   351      1 0x00000000

ffff880237281eb8 0000000000000082 0000000000000000 0000000000000001

ffff880237281e18 ffffffff8115c876 0000002a3dab586f ffffffff810bfcff

ffff880237281e48 00000000fffe3007 ffff88023725baf8 ffff880237281fd8

Call Trace:

[<ffffffff8115c876>] ? vfree+0x36/0x80

[<ffffffff810bfcff>] ? load_module+0x1abf/0x1cd0

[<ffffffffa003d000>] ? wait_scan_init+0x0/0xd [scsi_wait_scan]

[<ffffffff8136cbe5>] wait_for_device_probe+0x55/0x90

[<ffffffff8109eb00>] ? autoremove_wake_function+0x0/0x40

[<ffffffff810a5155>] ? __blocking_notifier_call_chain+0x65/0x80

[<ffffffffa003d009>] wait_scan_init+0x9/0xd [scsi_wait_scan]

[<ffffffff8100204c>] do_one_initcall+0x3c/0x1d0

[<ffffffff810bfff1>] sys_init_module+0xe1/0x250

[<ffffffff8100b072>] system_call_fastpath+0x16/0x1b

FW state [0] hasn't changed in 180 secs

pcidata = 30400

megaraid_sas 0000:02:0e.0: megasas: FW restarted successfully from megasas_init_fw!

megasas: Waiting for FW to come to ready state

INFO: task modprobe:351 blocked for more than 120 seconds.

      Not tainted 2.6.32-504.3.3.el6.x86_64 #1

"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

modprobe      D 0000000000000000     0   351      1 0x00000000

ffff880237281eb8 0000000000000082 0000000000000000 0000000000000001

ffff880237281e18 ffffffff8115c876 0000002a3dab586f ffffffff810bfcff

ffff880237281e48 00000000fffe3007 ffff88023725baf8 ffff880237281fd8

Call Trace:

[<ffffffff8115c876>] ? vfree+0x36/0x80

[<ffffffff810bfcff>] ? load_module+0x1abf/0x1cd0

[<ffffffffa003d000>] ? wait_scan_init+0x0/0xd [scsi_wait_scan]

[<ffffffff8136cbe5>] wait_for_device_probe+0x55/0x90

[<ffffffff8109eb00>] ? autoremove_wake_function+0x0/0x40

[<ffffffff810a5155>] ? __blocking_notifier_call_chain+0x65/0x80

[<ffffffffa003d009>] wait_scan_init+0x9/0xd [scsi_wait_scan]

[<ffffffff8100204c>] do_one_initcall+0x3c/0x1d0

[<ffffffff810bfff1>] sys_init_module+0xe1/0x250

[<ffffffff8100b072>] system_call_fastpath+0x16/0x1b

INFO: task modprobe:351 blocked for more than 120 seconds.

      Not tainted 2.6.32-504.3.3.el6.x86_64 #1

"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

modprobe      D 0000000000000000     0   351      1 0x00000000

ffff880237281eb8 0000000000000082 0000000000000000 0000000000000001

ffff880237281e18 ffffffff8115c876 0000002a3dab586f ffffffff810bfcff

ffff880237281e48 00000000fffe3007 ffff88023725baf8 ffff880237281fd8

Call Trace:

[<ffffffff8115c876>] ? vfree+0x36/0x80

[<ffffffff810bfcff>] ? load_module+0x1abf/0x1cd0

[<ffffffffa003d000>] ? wait_scan_init+0x0/0xd [scsi_wait_scan]

[<ffffffff8136cbe5>] wait_for_device_probe+0x55/0x90

[<ffffffff8109eb00>] ? autoremove_wake_function+0x0/0x40

<snip>

 

 

 

 

 

I don't think any of the above kernel messages are anything more than long timeout values for drives. That is odd, but I am not sure what it means. I can post the full kernel message output if needed.
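
The controller-related lines can be separated from the rest of the kernel log with a short script. The following is a minimal Python sketch, assuming the message wording matches the driver version shown above (06.803.01.00-rh1); other versions may word these messages differently. The function name `summarize` and the `SAMPLE` text are illustrative only. Reading the four IDs on the `megasas:` line as vendor, device, subsystem vendor and subsystem device is my understanding of the driver's output, and it is consistent with the lspci listing (LSI device, Intel subsystem).

```python
import re

# Patterns copied from the dmesg excerpt above (assumed wording, not a stable interface).
PCI_IDS = re.compile(
    r"megasas: (0x[0-9a-fA-F]{4}):(0x[0-9a-fA-F]{4}):"
    r"(0x[0-9a-fA-F]{4}):(0x[0-9a-fA-F]{4}):"
)
FW_WAIT = re.compile(r"megasas: Waiting for FW to come to ready state")
FW_STUCK = re.compile(r"FW state \[(\d+)\] hasn't changed in (\d+) secs")
HUNG_TASK = re.compile(r"INFO: task (\S+):(\d+) blocked for more than (\d+) seconds")

ID_NAMES = ("vendor", "device", "subsystem_vendor", "subsystem_device")


def summarize(log_text):
    """Collect the megasas firmware messages and hung-task warnings from a kernel log."""
    summary = {"pci_ids": None, "fw_waits": 0, "fw_stuck": [], "hung_tasks": []}
    for line in log_text.splitlines():
        match = PCI_IDS.search(line)
        if match:
            summary["pci_ids"] = dict(zip(ID_NAMES, match.groups()))
        if FW_WAIT.search(line):
            summary["fw_waits"] += 1
        match = FW_STUCK.search(line)
        if match:
            # (firmware state value, seconds without change)
            summary["fw_stuck"].append((int(match.group(1)), int(match.group(2))))
        match = HUNG_TASK.search(line)
        if match:
            # (task name, pid, seconds blocked)
            summary["hung_tasks"].append(
                (match.group(1), int(match.group(2)), int(match.group(3)))
            )
    return summary


# A few lines taken from the excerpt above, used as sample input.
SAMPLE = """\
megasas: 0x1000:0x0411:0x8086:0x1001: bus 2:slot 14:func 0
megasas: Waiting for FW to come to ready state
INFO: task modprobe:351 blocked for more than 120 seconds.
FW state [0] hasn't changed in 180 secs
megasas: Waiting for FW to come to ready state
"""

print(summarize(SAMPLE))
```

In this excerpt the firmware state is reported as unchanged for 180 seconds, and the blocked task is the modprobe waiting on the device probe. This suggests the messages concern the controller firmware not reaching the ready state rather than drive timeouts, though the full log would be needed to confirm that.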

 

 

Installed Intel RAID services / tools:

[root@titan1 RAID]# ls

ir3_1068SASHWR_FW_v1.12.280-0826_pkg-v7.0.1-0075.zip  ir3_Linux_x86_RWC2_v14.05.02.03.tar.gz  Linux_x64_RWC2_v14.08.01-04.tar.gz

ir3_CmdTool2_Linux_v8.07.16.zip                       ir3_UEFI_CmdTool2_v2.03.00.s6.zip       MR_Linux_drv_v6.705.07.00.tgz

 

[root@titan1 RAID]# cd /usr/local/RAID\ Web\ Console\ 2/

[root@titan1 RAID Web Console 2]# ./start

starthelp.sh         startmonitorhelp.sh  startupui.sh

[root@titan1 RAID Web Console 2]# ./startupui.sh

The message above is just a lack of connection from the web UI to some agent (?). What I am not clear on is which agent or service I should check is running so that this UI can connect to it.

 

 

Attempt to flash firmware to controller:

[root@titan1 tmp]# cp /root/RAID/ir3_1068SASHWR_FW_v1.12.280-0826_pkg-v7.0.1-0075.zip .

[root@titan1 tmp]# unzip ir3_1068SASHWR_FW_v1.12.280-0826_pkg-v7.0.1-0075.zip

Archive:  ir3_1068SASHWR_FW_v1.12.280-0826_pkg-v7.0.1-0075.zip

  inflating: update.nsh

   creating: CmdTool2/DOS/

  inflating: CmdTool2/DOS/CmdTool2.exe

  inflating: CmdTool2/DOS/CMDTool2_DOS_v8.00.11_rel-notes.txt

  inflating: CmdTool2/DOS/LICENSE_DOS32A.txt

   creating: CmdTool2/Linux/

  inflating: CmdTool2/Linux/CmdTool2-8.00.13-1.i386.rpm

  inflating: CmdTool2/Linux/CMDTool2_Linux_v8.00.13_rel-notes.txt

  inflating: CmdTool2/Linux/Lib_Utils-1.00-07.noarch.rpm

  inflating: CmdTool2/Linux/Lib_Utils2-1.00-01.noarch.rpm

   creating: CmdTool2/Solaris/

  inflating: CmdTool2/Solaris/CmdTool2

  inflating: CmdTool2/Solaris/CmdTool2.pkg

  inflating: CmdTool2/Solaris/CMDTool2_Solaris_v8.00.06_rel-notes.txt

   creating: CmdTool2/UEFI/

  inflating: CmdTool2/UEFI/CmdTool2.efi

  inflating: CmdTool2/UEFI/CMDTool2_UEFI_v2.01.00.S6_rel-notes.txt

   creating: CmdTool2/Windows/

  inflating: CmdTool2/Windows/CmdTool2.exe

  inflating: CmdTool2/Windows/CmdTool2Support.zip

  inflating: CmdTool2/Windows/CMDTool2_Windows_v8.00.11_rel-notes.txt

  inflating: 68_fw826.rom

  inflating: ir3_1068SASHWR_Firmware_v1.12.280-0826_readme.txt

  inflating: License_v2.pdf

  inflating: update.bat

  inflating: 68_fw826_4MB.rom

[root@titan1 tmp]#

[root@titan1 tmp]# mv *.rom /opt/MegaRAID/CmdTool2/

[root@titan1 CmdTool2]# ./CmdTool264  -adpfwflash -f 68_fw826.rom

Invalid input at or near token 68_fw826.rom

 

Exit Code: 0x01

[root@titan1 CmdTool2]#
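
One possible cause of the "Invalid input" error, not confirmed against this CmdTool2 build, is that the MegaCLI-style syntax expects an adapter selector (`-aN` or `-aALL`) after the file name, and the command above has none. The Python sketch below only builds and prints the command line so it can be reviewed before anything is run. The helper name `flash_command` and adapter index 0 are assumptions. If the firmware never reaches the ready state, the tool may not see the adapter at all.

```python
import shlex


def flash_command(rom_file, adapter="0", tool="./CmdTool264"):
    """Build a firmware-flash command line with an explicit adapter selector.

    The adapter index is an assumption; use whatever index the tool
    reports for the controller (or "ALL").
    """
    return [tool, "-AdpFwFlash", "-f", rom_file, f"-a{adapter}"]


# Print the command instead of executing it, so it can be checked first.
print(shlex.join(flash_command("68_fw826.rom")))
```

Printing the command rather than running it is deliberate: flashing a controller that still holds the only copy of the array configuration is risky, so it is worth checking the arguments first.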

 

 

 

Questions:

 

1) Does anyone have any experience with this, or direction on how to debug it further?

2) Does anyone have this RAID controller running CentOS / RHEL 6.6?

3) The inability to use <ctrl+G> and UEFI is not good. I do have a note in my system change control about something like this, which says to get past it by "disabling all other controller BIOS on motherboard". I would like to know whether others have had this issue and whether there is a workaround for it.

 

Thanks,

