MantisBT - ParaView | ||||||||||
| View Issue Details | ||||||||||
| ID | Project | Category | View Status | Date Submitted | Last Update | |||||
| 0012720 | ParaView | (No Category) | public | 2011-11-10 21:55 | 2012-02-08 17:22 | |||||
| Reporter | Alan Scott | |||||||||
| Assigned To | Utkarsh Ayachit | |||||||||
| Priority | urgent | Severity | minor | Reproducibility | have not tried | |||||
| Status | closed | Resolution | fixed | |||||||
| Platform | OS | OS Version | ||||||||
| Product Version | 3.12 | |||||||||
| Target Version | Fixed in Version | 3.14 | ||||||||
| Project | Sandia | |||||||||
| Topic Name | 12720_cth_reads_too_much | |||||||||
| Type | crash | |||||||||
| Summary | 0012720: CTH reads file 0 for all processes | |||||||||
| Description | We suspect that large Cray clusters serialize access to a single file when multiple pvservers try to access it at once. As we scale into the thousands of pvservers, we believe this is becoming fatal. ParaView 3.12.0, remote server (I am using 8 processes), Linux client. Although I am sure you can replicate this with any CTH dataset, I am doing the following: * Make soft links (ln -s) to files spcta.0, spcta.1, spcta.2 and spcta.3 of Dave's big CTH AMR dataset (i.e., 256 files). Now we have a 4-file subset of this dataset. * strace -o $HOME/pvserver.strace -tt -f -ff -e trace=open,close,read,write - This will create a different file for each process. Do an ls -ls on these files. The smaller ones are not of interest; the larger ones are from lib/paraview3.12/pvserver. We care about the larger ones. - Note that 4 of them are slightly larger than the smaller ones. We care about these larger files. Open each file in turn. Search for spcth. Notice that each process opens file 0 four times, and then opens its real file two times. As stated, we believe that these 4 opens of file 0 are fatal for Cielo and possibly other Cray systems. This is a show-stopper bug for Cielo going into production with expected-size datasets. I will send the log files from my run to Utkarsh and Robert. I am marking this as a crash, although technically it is a hang (or a glacier, take your pick). | |||||||||
| Steps To Reproduce | ||||||||||
| Additional Information | ||||||||||
| Tags | No tags attached. | |||||||||
| Relationships |
| |||||||||
| Attached Files | ||||||||||
| Issue History | ||||||||||
| Date Modified | Username | Field | Change | |||||||
| 2011-11-10 21:55 | Alan Scott | New Issue | ||||||||
| 2011-11-11 13:40 | Utkarsh Ayachit | Assigned To | => Utkarsh Ayachit | |||||||
| 2011-11-14 10:51 | Utkarsh Ayachit | Status | backlog => todo | |||||||
| 2011-11-14 10:51 | Utkarsh Ayachit | Status | todo => active development | |||||||
| 2011-11-14 17:20 | Utkarsh Ayachit | Topic Name | => 12720_cth_reads_too_much | |||||||
| 2011-11-14 17:20 | Utkarsh Ayachit | Note Added: 0027690 | ||||||||
| 2011-11-14 17:20 | Utkarsh Ayachit | Status | active development => gatekeeper review | |||||||
| 2011-11-14 17:20 | Utkarsh Ayachit | Fixed in Version | => git-next | |||||||
| 2011-11-14 17:20 | Utkarsh Ayachit | Resolution | open => fixed | |||||||
| 2011-11-15 13:50 | Utkarsh Ayachit | Relationship added | parent of 0012729 | |||||||
| 2011-11-18 14:53 | Utkarsh Ayachit | Fixed in Version | git-next => git-master | |||||||
| 2011-11-18 14:54 | Utkarsh Ayachit | Status | gatekeeper review => customer review | |||||||
| 2011-11-18 14:54 | Utkarsh Ayachit | Note Added: 0027718 | ||||||||
| 2011-12-21 21:47 | Alan Scott | Note Added: 0027878 | ||||||||
| 2011-12-21 21:47 | Alan Scott | Status | customer review => closed | |||||||
| 2012-02-08 17:22 | Utkarsh Ayachit | Fixed in Version | git-master => 3.14 | |||||||
| Notes | |||||
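The manual check in the description (open each strace log, search for spcth, count how often file 0 is opened) can be sketched as a small script. This is only an illustration, not part of the original report or the fix: the function names `count_opens` and `summarize` are hypothetical, the log file pattern `pvserver.strace.*` assumes strace's `-ff` per-process naming applied to the `-o` path given above, and the regular expression assumes the usual `open("path", flags) = fd` line layout.

```python
import re
from collections import Counter
from pathlib import Path

# Matches the path argument of an open() call in an strace line, e.g.
#   21:55:01.000001 open("/data/spcth.0", O_RDONLY) = 7
# (assumed line layout; openat() calls are deliberately not matched).
OPEN_CALL = re.compile(r'\bopen\("([^"]+)"')


def count_opens(lines, name_filter="spcth"):
    """Count open() calls per file name among paths containing name_filter."""
    counts = Counter()
    for line in lines:
        match = OPEN_CALL.search(line)
        if match and name_filter in match.group(1):
            counts[Path(match.group(1)).name] += 1
    return counts


def summarize(log_dir, pattern="pvserver.strace.*"):
    """Print per-process open counts for every strace log in log_dir."""
    for log in sorted(Path(log_dir).glob(pattern)):
        with log.open(errors="replace") as handle:
            counts = count_opens(handle)
        if counts:  # skip logs that never touch the dataset
            print(log.name, dict(counts))
```

Calling `summarize` on the directory holding the logs prints one line per process. A process that opens file 0 more often than its own file would show the pattern reported here.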