Fluent Error FAQ

Process 1928: Received signal SIGSEGV.

Running on Windows
Mesh size, 12M

serial
    Error: received a fatal signal (Segmentation fault).
    Error Object: #f

parallel
    select 4 processors

    error information
    Node 0: Process 1928: Received signal SIGSEGV.

    Node 5: Process 2824: Received signal SIGSEGV.
    MPI Application rank 0 exited before MPI_Finalize() with status 2
    The fl process could not be started.

    Reason
        This is primarily a Windows issue.

        If you run Fluent with -t1 or a higher number of processes and leave the session for an extended period of time (2-20 hours), the following message appears in the console:

        The fl process could not be started.

        No other information about what timed out is provided, and only the cortex process is left running. This issue becomes more significant in light of the switch from serial to -t1.
        IP interfaces on the machine.

        The problem seems to occur when the PPP Adapter Umbrella periodically disconnects. We do not see connection disruptions between the cortex and host processes, which are connected through the local loopback: 127.0.0.1

    Resolution:

        Windows

        Type: ipconfig /all in a Command Prompt window

        If the PPP Adapter Umbrella is listed as connected, disable it if possible.
        You can also enforce the use of the local loopback by entering this value in the Environment tab of the Fluent Launcher:

        FLUENT_HOST_IP=127.0.0.1

        If you submit jobs from the command line, add this value to the command line when launching Fluent, or modify the Environment tab of the Fluent Launcher as noted above.

        This issue will be addressed in future releases of Fluent.

        From <https://support.ansys.com/AnsysCustomerPortal/en_us/Knowledge%20Resources/Solutions/FLUENT%20-%20SYS/2050947>

    Error: Non-conformal periodic interface with empty intersection.
    This is likely due to incorrect specification of periodic
    offset values. Please check the offset values specified
    and recreate the interface.
    Error: Couldn't intersect threads 6 and 5 (periodic faces).

    Solution: apply a *match control* in meshing.

Received signal SIGSEGV on Linux

*** Description
UDF parallel running\\
job 141938\\
error information
- hwloc 1.11.1 has encountered what looks like an error from the operating system\\
- L3 (cpuset 0x000003f0) intersects with NUMANode (P#0 cpuset 0x0000003f) without inclusion!\\
- Error occurred in topology.c line 981\\
- Node 1: Process 22250: Received signal SIGSEGV.\\
- Fatal error in one of the compute processes.\\
*** answer
Change the node to an NXV node.
- SM nodes produce the above error message.
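
To check whether a failed job hit this particular problem, the job output can be searched for the two signatures quoted above. The following is only a minimal sketch: the function name ~has_hwloc_segv~ is hypothetical, and it assumes the scheduler wrote the messages to a plain-text log file.

```shell
#!/bin/bash
# has_hwloc_segv LOGFILE (hypothetical helper)
# Succeeds only if the log contains BOTH the hwloc topology warning
# and the SIGSEGV line from the error listing above.
has_hwloc_segv() {
    grep -q 'hwloc .* has encountered what looks like an error from the operating system' "$1" &&
        grep -q 'Received signal SIGSEGV' "$1"
}
```

A call such as ~has_hwloc_segv <job output file> && echo "resubmit on nxv nodes"~ would then flag jobs that match the case described here.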

*** How to specify the number of nodes?

The tutorial on the HPC website is as follows:

Here is an example of an Ansys job running on *64 cores* across *two nxv nodes* [fn:ansys-parallel-linux].

#+BEGIN_SRC
#!/bin/bash
#$ -cwd
#$ -j y
#$ -m bea #get notifications on job start, finish and abortion.

#$ -pe parallel 64
#$ -l h_rt=4:0:0
#$ -l infiniband=nxv
#$ -l ansys=64 #request 64 ansys licenses.

module load ansys
ansys180 -dis -np ${NSLOTS} -b < [INPUT FILE]
#+END_SRC
Which command or script line corresponds to “two” nodes?
What does “#$ -l ansys=64” mean?
Do I need to replace ${NSLOTS} with a number? If so, it would duplicate #$ -pe parallel 64.

*answer*
You should only request the number of cores you want your job to run on and whether the job is smp or parallel.

On the nxv parallel nodes, your request must use all 32 cores on a single node with a minimum of 2 nodes (hence 64 cores).

If you wanted to use 3 or 4 nxv parallel nodes, your request would be "-pe parallel 96" or "-pe parallel 128" respectively.

Ansys is licensed for 128 cores on Apocrita. When running the application, you can request licences via ~-l ansys=<number of cores>~.
So if you use 64 cores, you should pass the "-l ansys=64" scheduler parameter to request 64 Ansys licences.

The ~NSLOTS~ variable is set to the number of cores requested with "-pe".
We prefer using this variable to ensure applications only use the number of cores requested.
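
The arithmetic in the answer above (32 cores per nxv node, at least 2 nodes, at most 128 licensed cores) can be sketched as a small shell helper. This is an illustration only: the function name ~nxv_slots~ and the two variable names are hypothetical, and the numbers are simply those quoted in the answer.

```shell
#!/bin/bash
# Values taken from the answer above; variable names are hypothetical.
CORES_PER_NXV_NODE=32      # a request must use all 32 cores of each nxv node
ANSYS_LICENSED_CORES=128   # Ansys licence limit quoted for Apocrita

# nxv_slots NODES (hypothetical helper)
# Prints the slot count to use for "-pe parallel" and "-l ansys=",
# or fails if the node count breaks one of the stated limits.
nxv_slots() {
    nodes=$1
    # Reject empty, non-numeric, zero, or zero-padded input.
    case $nodes in
        ''|*[!0-9]*|0*) echo "node count must be a positive integer" >&2; return 1 ;;
    esac
    if [ "$nodes" -lt 2 ]; then
        echo "nxv parallel jobs need at least 2 nodes" >&2
        return 1
    fi
    slots=$(( nodes * CORES_PER_NXV_NODE ))
    if [ "$slots" -gt "$ANSYS_LICENSED_CORES" ]; then
        echo "requested $slots cores, but only $ANSYS_LICENSED_CORES are licensed" >&2
        return 1
    fi
    echo "$slots"
}
```

For example, ~nxv_slots 3~ prints 96, which is the value to use for both ~-pe parallel~ and ~-l ansys=~ in the job script.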

Turbulent viscosity limited to viscosity ratio of 1e+05

*** reason
The possible *causes* for a large turbulent viscosity ratio include:
- Bad initial conditions for the turbulence quantities (k and e)
- Improper turbulent boundary conditions
- Skewed cells
*** solution
If the problem is not caused by a *bad mesh*, then its appearance *at the beginning of the run* can usually be avoided as follows:
- Turn off solving the *turbulence equations* for the first 100-200 iterations
- Turn on turbulence and continue iterations

If the problem occurs *in the middle of the iteration process*, then use the following procedure:
- Stop the iteration
- Turn *off* all equations except the *turbulence equations*
- Increase the turbulence under-relaxation factors (URFs) (k and e) to 1 and iterate for 20-50 iterations
- *Turn all equations back on*, reduce the turbulence URFs to 0.5-0.8, and then continue iterations
- Repeat the above steps several times

For *faster convergence*, it might be useful to obtain an initial solution wit...
