ARAnyM VMs with Debian hanging at 100% CPU usage

July 25th, 2012 - 04:10 pm ET by Thorsten Glaser | Report spam
Hi everyone,

I *really* need a stand-alone testcase for the GCC __sync_* builtins
or specifically, the cmpxchg syscall. A threded one.

I’ve got ARAnyM VMs hanging at 100% CPU, no network response any more,
during builds of first qt4-x11, and now cwidget. (Although I don’t know
if the former uses those GCC primitives, if available, in their generic
atomics code, or the latter uses atomics at all, this is a first guess.)

Since the VMs hang so that I must kill them, no debugging is possible.

The last lines of the cwidget build log (although eatmydata and the
ext4 journal replay might have cut off more) were:

if g++ -DHAVE_CONFIG_H -I. -I. -I.. -I.. -Wall -Werror -I.. -I../src -D_FORTIFY_SOURCE=2 -g -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security -D_REENTRANT -I/usr/include/sigc++-2.0 -I/usr/lib/m68k-linux-gnu/sigc++-2.0/include -MT test_threads.o -MD -MP -MF ".deps/test_threads.Tpo" -c -o test_threads.o test_threads.cc; \
then mv -f ".deps/test_threads.Tpo" ".deps/test_threads.Po"; else rm -f ".deps/test_threads.Tpo"; exit 1; fi
/bin/bash ../libtool --tag=CXX --mode=link g++ -g -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security -D_REENTRANT -I/usr/include/sigc++-2.0 -I/usr/lib/m68k-linux-gnu/sigc++-2.0/include -Wl,-z,relro -o test main.o test_eassert.o test_ssprintf.o test_threads.o ../src/cwidget/libcwidget.la -lcppunit -lncursesw -lpthread -lsigc-2.0
libtool: link: g++ -g -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security -D_REENTRANT -I/usr/include/sigc++-2.0 -I/usr/lib/m68k-linux-gnu/sigc++-2.0/include -Wl,-z -Wl,relro -o .libs/test main.o test_eassert.o test_ssprintf.o test_threads.o ../src/cwidget/.libs/libcwidget.so /usr/lib/libcppunit.so -lncursesw -lpthread /usr/lib/m68k-linux-gnu/libsigc-2.0.so
make[3]: Leaving directory `/tmp/buildd/cwidget-0.5.16/tests'
/usr/bin/make check-TESTS
make[3]: Entering directory `/tmp/buildd/cwidget-0.5.16/tests'

Furthermore, as already mentioned, when I tried to test whether my
backport of the GCC atomics worked, I made a small test programme
to see whether it linked, but it *also* froze the VM (and I’m of the
opinion that it at most should have made the process inside eat up
almost-all CPU and be still killable/accessible). Parallel programming
is not mine, so I don’t know whether I did anything wrong (it wasn’t
really meant to be executed but to freeze the entire OS *and* virtual
hardware isn’t what I’d expect even in an endless loop).

Host: Kubuntu 8.04 (hardy) i386, Linux 2.6.24-32-server (2.6.24-32.104),
mpfr4 3.1.0-5~wtf804+1, aranym 0.9.13-3.1~wtf804+2

Guest: Debian sid m68k, Linux 3.2.0-3-atari (3.2.21-3)

Right now I suspect a bug on any layer, including the emulation.
It may or may not happen on my “other” host system:
Debian sid amd64 Xen domU, Linux 2.6.32-5-xen-amd64 (2.6.32-41),
mpfr4 3.1.0-5, aranym 0.9.13-3.1

If really needed, I can check that (and crash that VM, too) once the
guest I have on it is done compiling 3.2.23-1, i.e. in about 4-5 days.

Thanks in advance,
//mirabilos
<Natureshadow> Dann mach ich git annex copy --to shore und fertig ist das
<Natureshadow> das ist ja viel cooler als ownCloud ...
<mirabilos> sag ich doch
<Natureshadow> ja wieso stimmt das denn immer was du sagst ...


To UNSUBSCRIBE, email to debian-68k-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org
Archive: http://lists.debian.org/Pine.BSM.4....8400.18420@herc.mirbsd.org
email Follow the discussionReplies 32 repliesReplies Make a reply

Similar topics

Replies

#1 Thorsten Glaser
July 25th, 2012 - 04:30 pm ET | Report spam
Dixi quod…

during builds of first qt4-x11, and now cwidget. (Although I don’t know



Sorry, that was during compile of cmake and cwidget, not qt4-x11.

bye,
//mirabilos
13:37⎜«Natureshadow» Deep inside, I hate mirabilos. I mean, he's a good
guy. But he's always right! In every fsckin' situation, he's right. Even
with his deeply perverted taste in software and borked ambition towards
broken OSes - in the end, he's damn right about it :(! […] works in mksh


To UNSUBSCRIBE, email to
with a subject of "unsubscribe". Trouble? Contact
Archive: http://lists.debian.org/
Replies Reply to this message
#2 Thorsten Glaser
July 25th, 2012 - 05:00 pm ET | Report spam
Andreas Schwab dixit:

Why do you think aranym is freezing?



No screen output, the network interface is not accessible any more,
and the VM process eats up more CPU than it does when the guest is
100% busy (then it’s normally ~98-99%). To me, indistinguishable
from the Linux kernel busy-looping, though. I said I suspect *all*
layers, not just aranym, but I can’t not suspect it either…

bye,
//mirabilos
21:27⎜[Natureshadow] BÄH! Wer hatn das Bier neben den Notebooklüfter
⎜ gestellt ...
21:27⎜>Natureshadow< lol 21:27⎜>Natureshadow< du?
21:27⎜[Natureshadow] vermutlich ... -- Kev^WNatureshadow allein zu Haus


To UNSUBSCRIBE, email to
with a subject of "unsubscribe". Trouble? Contact
Archive: http://lists.debian.org/
Replies Reply to this message
#3 Andreas Schwab
July 25th, 2012 - 05:10 pm ET | Report spam
Thorsten Glaser writes:

(it wasn’t really meant to be executed but to freeze the entire OS
*and* virtual hardware isn’t what I’d expect even in an endless loop).



Why do you think aranym is freezing?

Andreas.

Andreas Schwab,
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."


To UNSUBSCRIBE, email to
with a subject of "unsubscribe". Trouble? Contact
Archive: http://lists.debian.org/
Replies Reply to this message
#4 Andreas Schwab
July 25th, 2012 - 05:40 pm ET | Report spam
Thorsten Glaser writes:

No screen output, the network interface is not accessible any more,



All this is controlled by the guest.

and the VM process eats up more CPU than it does when the guest is
100% busy (then it’s normally ~98-99%).



When the guest is spinning then the VM runs at full speed, of course.

Andreas.

Andreas Schwab,
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."


To UNSUBSCRIBE, email to
with a subject of "unsubscribe". Trouble? Contact
Archive: http://lists.debian.org/
Replies Reply to this message
#5 Thorsten Glaser
July 26th, 2012 - 04:40 am ET | Report spam
Dixi quod…

I’ve got ARAnyM VMs hanging at 100% CPU, no network response any more,



Uh, it gets more interesting. Please, if any ARAnyM, Linux or Debian
developer has got an idea…

I thought I just had the hang during an update. Resetted the VM, then
tried the update again. The following is a “screenshot” of the split-
screen terminal (GNU screen) at freeze point (108x33):

Reading package lists...
Building dependency tree...
Reading state information... Starting
Starting 2
Done
The following packages will be upgraded:
dpkg libc-bin libc-dev-bin libc6 libc6-dev libdb5.1 libxml2 multiarch-support
8 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
Need to get 11.5 MB of archives.
After this operation, 278 kB of additional disk space will be used.
Get:1 http://ftp.debian-ports.org/debian/ unstable/main dpkg m68k 1.16.8 [2293 kB]
Get:2 http://www.freewrt.org/~tg/debs68k/ sid/main libdb5.1 m68k 5.1.29-5 [632 kB]
Get:3 http://ftp.debian-ports.org/debian/ unstable/main libc6-dev m68k 2.13-35 [2140 kB]

0 -
:~ # cd /var/cache/pbuilder/build/cow.1393/var/l
lib/ local/ lock/ log/
:~ # cd /var/cache/pbuilder/build/cow.1393/var/lib/apt/lists/
:...der/build/cow.1393/var/lib/apt/lists # pax -rw -pe f* w* /var/lib/apt/lists/;cd;agus -f
+ apt-get update










1 -
aranym 1.65 1.07 0.55 Thu 26.07.2012 08:17 [0 -] 1 -

(Explanation: the upper tab has 'eatmydata cowbuilder --update' running;
in the lower tab I copy the APT repository lists it has downloaded into
the live system and then run 'agus -f', which for our purpose is 'set -x;
apt-get update'.)

This is the stdout+stderr of the VM:

Running Aranym on X11: :0.0
ARAnyM 0.9.13
Using config file: 'aranym.config.x11'
Could not open joystick 0
ARAnyM RTC Timer: /dev/rtc: Device or resource busy
ARAnyM LILO: Error loading ramdisk 'root.bin'
[ 0.000000] Initializing cgroup subsys cpuset
[ 0.000000] Initializing cgroup subsys cpu
[ 0.000000] Linux version 3.2.0-3-atari (Debian 3.2.21-3) () (gcc version 4.6.3 (Debian 4.6.3-7+m68k.2) ) #1 Sat Jun 30 06:00:58 UTC 2012
[ 0.000000] console [debug0] enabled
[ 0.000000] Atari hardware found: VIDEL STDMA-SCSI ST_MFP YM2149 PCM CODEC DSP56K SCC ANALOG_JOY Blitter tried to read byte from register ff8a00 at 007232
BLITTER IDE TT_CLK FDC_SPEED
[ 0.000000] NatFeats found (ARAnyM, 1.0)
[ 0.000000] On node 0 totalpages: 3584
[ 0.000000] free_area_init_node: node 0, pgdat 0037f7b0, node_mem_map 004b4000
[ 0.000000] DMA zone: 32 pages used for memmap
[ 0.000000] DMA zone: 0 pages reserved
[ 0.000000] DMA zone: 3552 pages, LIFO batch:0
[ 0.000000] On node 1 totalpages: 196608
[ 0.000000] free_area_init_node: node 1, pgdat 0038015c, node_mem_map 004d8090
[ 0.000000] DMA zone: 1728 pages used for memmap
[ 0.000000] DMA zone: 0 pages reserved
[ 0.000000] DMA zone: 194880 pages, LIFO batch:31
[ 0.000000] pcpu-alloc: s0 r0 d32768 u32768 alloc=1*32768
[ 0.000000] pcpu-alloc: [0] 0
[ 0.000000] Built 2 zonelists in Zone order, mobility grouping on. Total pages: 198432
[ 0.000000] Kernel command line: root=/dev/nfhd8p1 console=tty debug=par devtmpfs.mount=1 video=atafb:vga2 BOOT_IMAGE=vmlinux
[ 0.000000] PID hash table entries: 4096 (order: 2, 16384 bytes)
[ 0.000000] Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
[ 0.000000] Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
[ 0.000000] Memory: 787060k/787060k available (2756k kernel code, 10864k data, 112k init)
[ 0.000000] NR_IRQS:72
[ 0.000000] Console: colour dummy device 80x25
[ 0.000000] console [tty0] enabled
[ 0.010000] Calibrating delay loop... 94.61 BogoMIPS (lpjG3088)
[ 0.100000] pid_max: default: 32768 minimum: 301
[ 0.100000] Security Framework initialized
[ 0.100000] AppArmor: AppArmor disabled by boot time parameter
[ 0.100000] Mount-cache hash table entries: 512
[ 0.100000] Initializing cgroup subsys cpuacct
[ 0.100000] Initializing cgroup subsys memory
[ 0.100000] Initializing cgroup subsys devices
[ 0.100000] Initializing cgroup subsys freezer
[ 0.100000] Initializing cgroup subsys blkio
[ 0.110000] devtmpfs: initialized
[ 0.110000] print_constraints: dummy:
[ 0.110000] NET: Registered protocol family 16
[ 0.130000] bio: create slab <bio-0> at 0
[ 0.140000] SCSI subsystem initialized
[ 0.180000] NET: Registered protocol family 2
[ 0.180000] IP route cache hash table entries: 32768 (order: 5, 131072 bytes)
[ 0.190000] TCP established hash table entries: 131072 (order: 8, 1048576 bytes)
[ 0.210000] TCP bind hash table entries: 65536 (order: 6, 262144 bytes)
[ 0.220000] TCP: Hash tables configured (established 131072 bind 65536)
[ 0.220000] TCP reno registered
[ 0.220000] UDP hash table entries: 512 (order: 1, 8192 bytes)
[ 0.220000] UDP-Lite hash table entries: 512 (order: 1, 8192 bytes)
[ 0.220000] NET: Registered protocol family 1
[ 0.230000] nfhd8: found device with 20971440 blocks (512 bytes)
[ 0.230000] nfhd8: AHDI p1 p2
[ 0.230000] nfeth: API 5
[ 0.240000] eth0: nfeth addr:192.168.0.1 (192.168.0.2) HWaddr:52:54:00:22:00:01
[ 0.240000] audit: initializing netlink socket (disabled)
[ 0.240000] type 00 audit(1340698084.240:1): initialized
[ 0.240000] VFS: Disk quotas dquot_6.5.2
[ 0.240000] Dquot-cache hash table entries: 1024 (order 0, 4096 bytes)
[ 0.250000] msgmni has been set to 1537
[ 0.250000] alg: No test for stdrng (krng)
[ 0.250000] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 252)
[ 0.250000] io scheduler noop registered
[ 0.250000] io scheduler deadline registered
[ 0.250000] io scheduler cfq registered (default)
[ 0.250000] atafb_init: start
[ 0.250000] atafb_init: initializing Falcon hw
[ 0.250000] atafb: screen_base 00b9b000 real_screen_base 00b9b000 screen_len 81920
[ 0.260000] Determined 640x480, depth 1
[ 0.260000] virtual 640x1024
[ 0.270000] Console: switching to mono frame buffer device 80x30
[ 0.280000] fb0: frame buffer device, using 80K of video memory
[ 0.280000] Non-volatile memory driver v1.3
[ 0.290000] Atari floppy driver: max. HD, track buffering
[ 0.290000] Probing floppy drive(s):
[ 0.290000] fd0
[ 0.300000] Uniform Multi-Platform E-IDE driver
[ 0.320000] ide: Falcon IDE controller
[ 0.340000] Probing IDE interface ide0...
[ 0.720000] hda: Master, ATA DISK drive
[ 1.820000] ide0 at 0xfff00000 on irq 15 (serialized)
[ 1.820000] ide-gd driver 1.18
[ 1.840000] hda: max request size: 128KiB
[ 1.840000] hda: 20971440 sectors (10737 MB) w/256KiB Cache, CHS 805/16/63
[ 1.880000] hda: AHDI hda1 hda2
[ 1.890000] ide-cd driver 5.00
[ 1.900000] scsi0: options CAN_QUEUE=8 CMD_PER_LUN=1 SCAT-GAT=0 TAGGED-QUEUING=no HOSTID=7 generic options AUTOSENSE REAL DMA SCSI-2 TAGGED QUEUING generic release=7
[ 1.940000] scsi0 : Atari native SCSI
[ 1.950000] blk_queue_max_segments: set to minimum 1
[ 2.210000] blk_queue_max_segments: set to minimum 1
[ 2.220000] mousedev: PS/2 mouse device common for all mice
[ 2.470000] input: Atari Keyboard as /devices/virtual/input/input0
[ 2.480000] input: Atari mouse as /devices/virtual/input/input1
[ 2.740000] blk_queue_max_segments: set to minimum 1
[ 2.990000] blk_queue_max_segments: set to minimum 1
[ 3.250000] blk_queue_max_segments: set to minimum 1
[ 3.260000] rtc-generic rtc-generic: rtc core: registered rtc-generic as rtc0
[ 3.510000] blk_queue_max_segments: set to minimum 1
[ 3.520000] TCP cubic registered
[ 3.540000] NET: Registered protocol family 17
[ 3.540000] NET: Registered protocol family 15
[ 3.560000] registered taskstats version 1
[ 3.580000] scsi: waiting for bus probes to complete ...
[ 3.830000] blk_queue_max_segments: set to minimum 1
[ 4.140000] rtc-generic rtc-generic: setting system clock to 2012-07-26 08:08:08 UTC (1343290088)
[ 4.160000] EXT4-fs (nfhd8p1): mounting ext3 file system using the ext4 subsystem
[ 4.180000] EXT4-fs (nfhd8p1): INFO: recovery required on readonly filesystem
[ 4.200000] EXT4-fs (nfhd8p1): write access will be enabled during recovery
[ 10.660000] EXT4-fs (nfhd8p1): orphan cleanup on readonly fs
[ 10.660000] EXT4-fs (nfhd8p1): ext4_orphan_cleanup: deleting unreferenced inode 981142
[ 10.750000] EXT4-fs (nfhd8p1): ext4_orphan_cleanup: deleting unreferenced inode 654293
[ 10.790000] EXT4-fs (nfhd8p1): ext4_orphan_cleanup: deleting unreferenced inode 654433
[ 10.830000] EXT4-fs (nfhd8p1): ext4_orphan_cleanup: deleting unreferenced inode 1030237
[ 10.860000] EXT4-fs (nfhd8p1): 4 orphan inodes deleted
[ 10.880000] EXT4-fs (nfhd8p1): recovery complete
[ 10.900000] EXT4-fs (nfhd8p1): mounted filesystem with ordered data mode. Opts: (null)
[ 10.930000] VFS: Mounted root (ext3 filesystem) readonly on device 254:129.
[ 10.970000] devtmpfs: mounted
[ 19.180000] udevd[134]: starting version 175
Detected scancode offset = 8 (key: 'a' with scancode $26)
[ 25.680000] Adding 1989620k swap on /dev/nfhd8p2. Priority:-1 extents:1 across:1989620k
[ 26.090000] EXT4-fs (nfhd8p1): re-mounted. Opts: (null)
[ 27.510000] EXT4-fs (nfhd8p1): re-mounted. Opts: errors=remount-ro
[ 47.610000] [ cut here ]
[ 47.620000] WARNING: at /tmp/buildd/linux-3.2.21/kernel/irq/handle.c:130 handle_irq_event_percpu+0x130/0x13e()
[ 47.640000] irq 3 handler nfeth_interrupt+0x0/0x190 enabled interrupts
[ 47.660000] Modules linked in: evdev
[ 47.680000] Call Trace: [<000304ae>] warn_slowpath_common+0x48/0x6a
[ 47.700000] [<000304bc>] warn_slowpath_common+0x56/0x6a
[ 47.700000] [<00030548>] warn_slowpath_fmt+0x2a/0x32
[ 47.720000] [<00066696>] handle_irq_event_percpu+0x130/0x13e
[ 47.730000] [<002ac106>] printk+0x0/0x18
[ 47.740000] [<00066696>] handle_irq_event_percpu+0x130/0x13e
[ 47.760000] [<0000a338>] nfeth_interrupt+0x0/0x190
[ 47.770000] [<000666c4>] handle_irq_event+0x20/0x2c
[ 47.780000] [<00065ecc>] generic_handle_irq+0x2c/0x3a
[ 47.820000] [<00002940>] do_IRQ+0x20/0x32
[ 47.840000] [<00002766>] auto_irqhandler_fixup+0x4/0x6
[ 47.860000]
[ 47.860000] [ end trace 76d03722973df049 ]
[ 49.370000] NET: Registered protocol family 10
[ 60.040000] eth0: no IPv6 routers present
[ 596.120000] BUG: soft lockup - CPU#0 stuck for 23s! [https:1672]
[ 596.120000] Modules linked in: ipv6 evdev
[ 596.120000]
[ 596.120000] Format 00 Vector: 0114 PC: 00085602 Status: 2300 Tainted: G W
[ 596.120000] ORIG_D0: ffffffff D0: 00000000 A2: 2a57f510 A1: 2a40b6c8
[ 596.120000] A0: 00c9d170 D5: 00000000 D4: 2a649fcc
[ 596.120000] D3: 2a53c320 D2: 8001fda8 D1: 2a612000


Any help appreciated. I’m not assigning a culprit, just want this
solved. It seems to have popped up (more?) with the backport of
the GCC atomics stuff, which uses the syscall. That’s all I know.

TIA,
//mirabilos
08:05⎜<XTaran:#grml> mika: Does grml have an tool to read Apple
⎜ System Log (asl) files? :)
08:08⎜<ft:#grml> yeah. /bin/rm. ;) 08:09⎜<mrud:#grml> hexdump -C
08:31⎜<XTaran:#grml> ft, mrud: *g*


To UNSUBSCRIBE, email to
with a subject of "unsubscribe". Trouble? Contact
Archive: http://lists.debian.org/
Replies Reply to this message
Help Create a new topicNext page Replies Make a reply
Search Make your own search