CFD中文网

    Power outage interrupted the run; restarting fails with "mpirun Floating point exception", while an uninterrupted run works fine

    OpenFOAM
    • 五好青年

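      1. We have had frequent power outages lately. When I restart the run afterwards, it often fails with the error shown below.
      2. If the run completes in one go, without interruption, there is no error.
      3. If the run is interrupted and then restarted, there is roughly a 30% chance of hitting the following error:
        Could someone help me work out why restarting after a power cut fails?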
      Time = 14.626952879
      
      Courant Number mean: 0.0162202738735 max: 0.547412638078 velocity magnitude: 0.972694272435
      DILUPBiCG:  Solving for Ux, Initial residual = 5.51751945657e-05, Final residual = 1.21131845941e-07, No Iterations 1
      DILUPBiCG:  Solving for Uy, Initial residual = 0.00192416415543, Final residual = 2.83092826269e-06, No Iterations 1
      DILUPBiCG:  Solving for Uz, Initial residual = 0.00192162985727, Final residual = 2.27969090213e-06, No Iterations 1
      GAMGPCG:  Solving for p, Initial residual = 0.0445458617273, Final residual = 0.000312778215492, No Iterations 3
      IB time step continuity errors : sum local = 1.94733457266e-10, global = 1.30333893019e-13, cumulative = -2.85480937623e-12
      GAMGPCG:  Solving for p, Initial residual = 0.00400247451546, Final residual = 2.29650918295e-05, No Iterations 4
      IB time step continuity errors : sum local = 1.43803150508e-11, global = 2.57306076116e-13, cumulative = -2.59750330012e-12
      GAMGPCG:  Solving for p, Initial residual = 0.000264903909441, Final residual = 2.47545703992e-06, No Iterations 6
      IB time step continuity errors : sum local = 1.5492859545e-12, global = 2.31568779699e-13, cumulative = -2.36593452042e-12
      Thrust from Body Force = 0.0974865447818	Thrust from Act. Line = 0.0977889499694	Ratio = 0.996907573017
      Torque from Body Force = 0.00155780274379	Torque from Act. Line = 0.00156265463869	Ratio = 0.996895094552
      ExecutionTime = 2028.52 s  ClockTime = 2095 s
      
      fieldAverage fieldAverage1 output:
          Calculating averages
      
      Time = 14.62804371
      
      Courant Number mean: 0.0162202632119 max: 0.54826182015 velocity magnitude: 0.974203179026
      DILUPBiCG:  Solving for Ux, Initial residual = 5.51651007481e-05, Final residual = 1.21918770605e-07, No Iterations 1
      DILUPBiCG:  Solving for Uy, Initial residual = 0.00192414480744, Final residual = 2.62900058636e-06, No Iterations 1
      DILUPBiCG:  Solving for Uz, Initial residual = 0.0019217484706, Final residual = 2.60889084187e-06, No Iterations 1
      GAMGPCG:  Solving for p, Initial residual = 0.0444294506698, Final residual = 0.000310879570179, No Iterations 3
      IB time step continuity errors : sum local = 1.93563121456e-10, global = 2.67585622386e-13, cumulative = -2.09834889803e-12
      GAMGPCG:  Solving for p, Initial residual = 0.00397418566853, Final residual = 2.60870920638e-05, No Iterations 4
      IB time step continuity errors : sum local = 1.63549690983e-11, global = 2.1607547894e-13, cumulative = -1.88227341909e-12
      GAMGPCG:  Solving for p, Initial residual = 0.000265252820756, Final residual = 2.33165488111e-06, No Iterations 6
      IB time step continuity errors : sum local = 1.45553081823e-12, global = 8.3257735207e-14, cumulative = -1.79901568389e-12
      Thrust from Body Force = 0.0976809290426	Thrust from Act. Line = 0.097986534837	Ratio = 0.996881144997
      Torque from Body Force = 0.00156809160673	Torque from Act. Line = 0.00157301669264	Ratio = 0.996869018662
      triangles: 240 hit: 13039
      [dyfluid:02855] *** Process received signal ***
      [dyfluid:02855] Signal: Floating point exception (8)
      [dyfluid:02855] Signal code:  (-6)
      [dyfluid:02855] Failing at address: 0x3e800000b27
      [dyfluid:02855] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x354a0)[0x7ff3c6cfd4a0]
      [dyfluid:02855] [ 1] /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x38)[0x7ff3c6cfd418]
      [dyfluid:02855] [ 2] /lib/x86_64-linux-gnu/libc.so.6(+0x354a0)[0x7ff3c6cfd4a0]
      [dyfluid:02855] [ 3] /home/dyfluid/foam/foam-extend-4.0/lib/linux64GccDPOpt/libsampling.so(_ZN4Foam16vtkSurfaceWriterIdE9writeDataERNS_7OstreamERKNS_5FieldIdEE+0x7f)[0x7ff3c993148f]
      [dyfluid:02855] [ 4] /home/dyfluid/foam/foam-extend-4.0/lib/linux64GccDPOpt/libsampling.so(_ZNK4Foam16vtkSurfaceWriterIdE5writeERKNS_8fileNameES4_RKNS_5FieldINS_6VectorIdEEEERKNS_4ListINS_4faceEEES4_RKNS5_IdEEb+0x3a1)[0x7ff3c99333c1]
      [dyfluid:02855] [ 5] /home/dyfluid/foam/foam-extend-4.0/lib/linux64GccDPOpt/libimmersedBoundary.so(_ZNK4Foam28immersedBoundaryFvPatchFieldIdE5writeERNS_7OstreamE+0x48e)[0x7ff3c949416e]
      [dyfluid:02855] [ 6] pisoIbFoam-HAT-ALMorADM(_ZNK4Foam14GeometricFieldIdNS_12fvPatchFieldENS_7volMeshEE22GeometricBoundaryField10writeEntryERKNS_4wordERNS_7OstreamE+0x114)[0x4595a4]
      [dyfluid:02855] [ 7] pisoIbFoam-HAT-ALMorADM(_ZNK4Foam14GeometricFieldIdNS_12fvPatchFieldENS_7volMeshEE9writeDataERNS_7OstreamE+0xaf)[0x45cfaf]
      [dyfluid:02855] [ 8] /home/dyfluid/foam/foam-extend-4.0/lib/linux64GccDPOpt/libfoam.so(_ZNK4Foam11regIOobject11writeObjectENS_8IOstream12streamFormatENS1_13versionNumberENS1_15compressionTypeE+0x41c)[0x7ff3c8194ffc]
      [dyfluid:02855] [ 9] /home/dyfluid/foam/foam-extend-4.0/lib/linux64GccDPOpt/libfoam.so(_ZNK4Foam14objectRegistry11writeObjectENS_8IOstream12streamFormatENS1_13versionNumberENS1_15compressionTypeE+0x1db)[0x7ff3c8198cdb]
      [dyfluid:02855] [10] /home/dyfluid/foam/foam-extend-4.0/lib/linux64GccDPOpt/libfoam.so(_ZNK4Foam14objectRegistry11writeObjectENS_8IOstream12streamFormatENS1_13versionNumberENS1_15compressionTypeE+0x1db)[0x7ff3c8198cdb]
      [dyfluid:02855] [11] /home/dyfluid/foam/foam-extend-4.0/lib/linux64GccDPOpt/libfoam.so(_ZNK4Foam4Time11writeObjectENS_8IOstream12streamFormatENS1_13versionNumberENS1_15compressionTypeE+0x542)[0x7ff3c81b5212]
      [dyfluid:02855] [12] pisoIbFoam-HAT-ALMorADM[0x421795]
      [dyfluid:02855] [13] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7ff3c6ce8830]
      [dyfluid:02855] [14] pisoIbFoam-HAT-ALMorADM[0x424969]
      [dyfluid:02855] *** End of error message ***
      --------------------------------------------------------------------------
      mpirun noticed that process rank 7 with PID 2855 on node dyfluid exited on signal 8 (Floating point exception).
      --------------------------------------------------------------------------
      
      

      I am a CFD machine with no emotions. Welcome to browse my Bilibili, search: seeeeeeeeeeer

      • 李东岳 (administrator)

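        Check the write precision in controlDict and try increasing it. It may be that the last time step written before the restart was not stored with enough precision, so the run diverged after restarting.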

        CFD high-performance servers: http://dyfluid.com/servers.html

        • 五好青年

          @李东岳
          Dongyue, my precision is set to 12. Is that normally enough? Thanks.

            writeFormat       binary;
            writePrecision    12;
            writeCompression  compressed;
            timeFormat        general;
            timePrecision     12;
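
A note on these settings, offered as a sketch rather than a statement about foam-extend internals: to my understanding, writePrecision sets the number of significant digits for ASCII output, whereas writeFormat binary stores field values as raw floating-point numbers. The Python snippet below only illustrates that difference for a generic double. The variable names are made up for the illustration.

```python
import struct

x = 1.0 / 3.0  # arbitrary value that needs more than 12 significant digits

# ASCII-style write with 12 significant digits, then read back
ascii_roundtrip = float(format(x, ".12g"))

# Binary-style write of the raw 8-byte double, then read back
binary_roundtrip = struct.unpack("<d", struct.pack("<d", x))[0]

print("ascii  error:", abs(ascii_roundtrip - x))
print("binary error:", abs(binary_roundtrip - x))
```

If that reading is right, a 12-digit ASCII write changes each value only by a relative error of order 1e-12 and a binary write changes nothing, so write precision alone seems an unlikely cause of the crash.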
          

          • 李东岳 (administrator)

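            @五好青年 That is enough. Beyond that, I am not sure what the cause is.

            Also, why do you keep losing power? That sounds pretty unreliable.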

            • 五好青年

              @李东岳
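              Dongyue, I carefully compared the cases that restarted successfully with the cases that failed with the floating point exception.
              In every failing case, the error occurs at the first results write after the restart.

              I save results every 20 time steps, and the error above appears when the data is written at step 20.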
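
This observation matches the backtrace in the first post: frames 3 and 5 demangle to `Foam::vtkSurfaceWriter<double>::writeData(Foam::Ostream&, Foam::Field<double> const&)` and `Foam::immersedBoundaryFvPatchField<double>::write(Foam::Ostream&) const`, so the signal is raised while the immersed boundary patch field is being written, not inside the pressure or velocity solve. A minimal sketch for pulling the C++ frames out of such a log is shown below. The helper names `mangled_frames` and `demangle` are hypothetical, and `demangle` assumes the binutils tool `c++filt` is installed (it falls back to the raw symbol otherwise).

```python
import re
import shutil
import subprocess

# Matches e.g. "[ 3] /path/libsampling.so(_ZN4Foam...+0x7f)"
FRAME = re.compile(r"\[\s*(\d+)\]\s+(\S+?)\((_Z\w+)\+0x[0-9a-f]+\)")

def mangled_frames(log):
    """Return (frame index, binary name, mangled symbol) for each C++ frame."""
    return [(int(i), path.rsplit("/", 1)[-1], sym)
            for i, path, sym in FRAME.findall(log)]

def demangle(symbol):
    """Demangle with c++filt when available, else return the symbol unchanged."""
    if shutil.which("c++filt") is None:
        return symbol
    result = subprocess.run(["c++filt", symbol], capture_output=True, text=True)
    return result.stdout.strip() or symbol

# Two frames copied from the backtrace in the first post
sample = (
    "[dyfluid:02855] [ 3] /home/dyfluid/foam/foam-extend-4.0/lib/linux64GccDPOpt/"
    "libsampling.so(_ZN4Foam16vtkSurfaceWriterIdE9writeDataERNS_7OstreamERKNS_5FieldIdEE+0x7f)"
    "[0x7ff3c993148f]\n"
    "[dyfluid:02855] [ 5] /home/dyfluid/foam/foam-extend-4.0/lib/linux64GccDPOpt/"
    "libimmersedBoundary.so(_ZNK4Foam28immersedBoundaryFvPatchFieldIdE5writeERNS_7OstreamE+0x48e)"
    "[0x7ff3c949416e]\n"
)

frames = mangled_frames(sample)
for index, binary, symbol in frames:
    print(index, binary, demangle(symbol))
```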

              • 李东岳 (administrator)

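                @五好青年 So it fails as soon as it writes files. What happens if it does not write at all? What if it writes every 30 steps instead?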

                • 五好青年

                  @李东岳

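                  1. With the time precision changed to 6, the results from the previous step cannot be read at all.
                  2. With the time precision changed to 20, the run again fails at the first results write.
                  3. Changing the write interval from 20 steps to 1 step or to 30 steps makes no difference. On restart, the run always fails at the first results write, with the same floating point exception as in the original question.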
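
Item 1 would be consistent with the time directory name no longer matching on restart, assuming the time name is produced by formatting the time value with timePrecision significant digits in general format. That is an assumption about the solver's behaviour, not something confirmed in this thread. The helper `time_name` below is hypothetical and only mimics that formatting in Python, using a time value from the log in the first post.

```python
def time_name(t, precision):
    """Format a time value with `precision` significant digits (general format)."""
    return format(t, f".{precision}g")

written = "14.626952879"  # time value as printed in the solver log
t = float(written)

for precision in (6, 12):
    name = time_name(t, precision)
    print(precision, name, "matches" if name == written else "does not match")
```

With 6 digits the name collapses to 14.627, which no longer identifies the directory that was written with 12 digits.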

                    • Tens (lecturer)

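                    @五好青年 Is this a solver you modified yourself? I think I have run into this before.
                    It may be that some variable changes over time during the run but is never saved. After a restart that variable is re-initialized, and the initial value differs a lot from what it should be at the current time, which causes divergence.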
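
The mechanism described here can be shown with a toy model. The sketch below is not the poster's solver: `ToySolver`, `u` and `force` are hypothetical names, and the update rule is arbitrary. It only shows that when a state variable with memory is left out of the checkpoint, a restarted run no longer reproduces the uninterrupted one.

```python
from dataclasses import dataclass

@dataclass
class ToySolver:
    u: float = 1.0      # primary field, always written to the checkpoint
    force: float = 0.0  # auxiliary state that depends on its own history

    def step(self, dt=0.1):
        # force relaxes toward u, so its value carries memory of earlier steps
        self.force += 0.2 * (self.u - self.force)
        self.u += dt * (self.force - self.u)

def run(solver, n_steps):
    for _ in range(n_steps):
        solver.step()
    return solver

# Reference: 40 steps without interruption
full = run(ToySolver(), 40)

# Interrupted after 20 steps
first_half = run(ToySolver(), 20)

# Restart that restores only u; force silently falls back to its default 0.0
bad_restart = run(ToySolver(u=first_half.u), 20)

# Restart that restores the complete state
good_restart = run(ToySolver(u=first_half.u, force=first_half.force), 20)

print("uninterrupted     :", full.u)
print("restart, u only   :", bad_restart.u)
print("restart, all state:", good_restart.u)
```

In a real solver the analogous check would be to list every member that is updated each time step (for example in a custom actuator line or body force class) and confirm that each one is either written with the fields or recomputed from them on start-up.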
