volatile in C/C++ – Passion is like genius; a miracle.

A post I originally wrote on KLDP.

To sum it up in one sentence: 'don't use volatile.'

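To sum it up at slightly greater length: 'volatile exists for specific purposes. You should not use volatile without a complete understanding of your compiler and your machine, and using it without that understanding only makes your code non-portable, so you are better off not using it.'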

So what am I supposed to use instead?
-> Use a standard such as POSIX threads…
That is the answer.
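To make that answer concrete, here is a minimal sketch of a counter shared between threads, protected by a POSIX mutex and with no volatile anywhere. The names (`worker`, `run_counter_demo`, `NUM_THREADS`, `INCREMENTS`) are made up for illustration and are not from the quoted posts.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

enum { NUM_THREADS = 4, INCREMENTS = 100000 };

/* Plain, non-volatile shared data: the mutex provides the consistency. */
static long counter = 0;
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < INCREMENTS; i++) {
        pthread_mutex_lock(&counter_lock);
        counter++;                      /* only touched while holding the lock */
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Hypothetical driver: returns the final count, or -1 if a thread
 * could not be created. */
long run_counter_demo(void)
{
    pthread_t threads[NUM_THREADS];
    int started = 0;
    long result;

    pthread_mutex_lock(&counter_lock);
    counter = 0;
    pthread_mutex_unlock(&counter_lock);

    while (started < NUM_THREADS &&
           pthread_create(&threads[started], NULL, worker, NULL) == 0)
        started++;

    for (int i = 0; i < started; i++) {
        int rc = pthread_join(threads[i], NULL);
        assert(rc == 0);
        (void)rc;
    }
    if (started != NUM_THREADS)
        return -1;

    pthread_mutex_lock(&counter_lock);
    result = counter;
    pthread_mutex_unlock(&counter_lock);
    return result;
}
```

Calling `run_counter_demo()` from `main` (built with `-pthread` on most toolchains) should give `NUM_THREADS * INCREMENTS`. Making `counter` volatile would add nothing here.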

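The article by Scott Meyers cited at the very bottom is good.
In particular, it makes it very clear why volatile ended up so complicated and headache-inducing…
(For example, why reordering between nonvolatile and volatile accesses is allowed.)
I couldn't be bothered to edit this, so I'm posting it as is -_-;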

—

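In a multithreaded environment on a multiprocessor, you must not use volatile if you want to write portable code. In particular, in C/C++ a volatile variable that refers to an object is "undefined", so its behavior cannot be guaranteed. (I'm not sure whether the example above declared a variable for an object or not, though.)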

Last year there was a debate about this issue within the Java community. Back then the question was why double-checked locking does not work, and explaining it and responding to every objection one by one was exhausting. So I don't think I can explain C/C++ volatile all over again here; instead I will quote Compaq's POSIX Thread Architect. “Compiler volatile semantics are not sufficient when sharing flag_ between threads, because the hardware, as well as the compiler, may reorder memory accesses arbitrarily, even with volatile. (Nor would a compiler implementation that issued memory barriers at each sequence point for volatile variables be sufficient, unless ALL data was volatile, which is impractical and unreasonably expansive.)” (Original: http://tinyurl.com/5nk5s)

In general, volatile cannot be used as a synchronization construct in portable code. Especially with two or more CPUs, it can behave in unexpected ways…
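As a rough sketch of what to do instead of spinning on a volatile flag, the following hands one value from one thread to another using a POSIX mutex and condition variable. The flag and the data are ordinary variables, and every name here (`data_ready`, `payload`, `run_handoff_demo`) is invented for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t handoff_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  handoff_cond = PTHREAD_COND_INITIALIZER;
static int data_ready = 0;   /* the "flag": a plain int, not volatile */
static int payload = 0;      /* the data the flag vouches for */

static void *producer(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&handoff_lock);
    payload = 42;                        /* write the data first... */
    data_ready = 1;                      /* ...then assert the flag */
    pthread_cond_signal(&handoff_cond);
    pthread_mutex_unlock(&handoff_lock);
    return NULL;
}

/* Hypothetical driver: returns the value the consumer observed,
 * or -1 if the producer thread could not be created. */
int run_handoff_demo(void)
{
    pthread_t producer_thread;
    int seen;

    pthread_mutex_lock(&handoff_lock);
    data_ready = 0;
    payload = 0;
    pthread_mutex_unlock(&handoff_lock);

    if (pthread_create(&producer_thread, NULL, producer, NULL) != 0)
        return -1;

    pthread_mutex_lock(&handoff_lock);
    while (!data_ready)                  /* loop guards against spurious wakeups */
        pthread_cond_wait(&handoff_cond, &handoff_lock);
    seen = payload;                      /* flag seen under the lock, so data is valid */
    pthread_mutex_unlock(&handoff_lock);

    int rc = pthread_join(producer_thread, NULL);
    assert(rc == 0);
    (void)rc;
    return seen;
}
```

Both the flag and the data are read and written only while holding the same mutex, so the consumer never sees the flag set without also seeing the data.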

The quoted posts can be found at the address below.

http://tinyurl.com/49lhu

The first quote is from an IBM architect, and the second is from HP's Tru64 UNIX & VMS Thread Architect.

“….
> – when the ‘volatile’ keyword must be used in
> multithreaded programming?

Never in PORTABLE threaded programs. The semantics of the C and
C++ “volatile” keyword are too loose, and insufficient, to have
any particular value with threads. You don’t need it if you’re
using portable synchronization (like a POSIX mutex or semaphore)
because the semantics of the synchronization object provide the
consistency you need between threads.

The only use for “volatile” is in certain non-portable
“optimizations” to synchronize at (possibly) lower cost in
certain specialized circumstances. That depends on knowing and
understanding the specific semantics of “volatile” under your
particular compiler, and what other machine-specific steps you
might need to take. (For example, using “memory barrier”
builtins or assembly code.)

In general, you’re best sticking with POSIX synchronization, in
which case you’ve got no use at all for “volatile”. That is,
unless you have some existing use for the feature having nothing
to do with threads, such as to allow access to a variable after
longjmp(), or in an asynchronous signal handler, or when
accessing hardware device registers.
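
One of the non-thread uses listed above, the asynchronous signal handler, might look roughly like this in standard C. This is only a sketch: `got_signal`, `on_signal` and `run_signal_demo` are illustrative names, and the signal is raised synchronously with `raise()` to keep the example self-contained.

```c
#include <assert.h>
#include <signal.h>

/* volatile sig_atomic_t is the type the C standard allows a signal
 * handler to write safely. */
static volatile sig_atomic_t got_signal = 0;

static void on_signal(int signo)
{
    (void)signo;
    got_signal = 1;          /* the only thing the handler does */
}

/* Hypothetical driver: installs the handler, raises SIGINT at itself,
 * and returns the flag value it then observes (or -1 on failure). */
int run_signal_demo(void)
{
    void (*previous)(int);
    int seen;

    got_signal = 0;
    previous = signal(SIGINT, on_signal);
    if (previous == SIG_ERR)
        return -1;

    if (raise(SIGINT) != 0) {            /* handler runs before raise() returns */
        signal(SIGINT, previous);
        return -1;
    }

    seen = got_signal;
    assert(seen == 0 || seen == 1);      /* the handler only ever stores 1 */
    signal(SIGINT, previous);            /* put the old disposition back */
    return seen;
}
```

This is single-threaded on purpose: volatile here is about the compiler and the handler, not about visibility between threads.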

—

The C language defines a series of “sequence points” in the “abstract
language model” at which variable values must be consistent with language
rules. An optimizer is allowed substantial leeway in reordering or
eliminating sequence points to minimize loads and stores or other
computation. EXCEPT that operations involving a “volatile” variable must
conform to the sequence points defined in the abstract model: there is no
leeway for optimization or other modifications. Thus, all changes
previously made must be visible at each sequence point, and no subsequent
modifications may be visible at that point. (In other words, as C99 points
out explicitly, if a compiler exactly implements the language abstract
semantics at all sequence points then “volatile” is redundant.)

On a multiprocessor (which C does not recognize), “sequence points” can only
be reasonably interpreted to refer to the view of memory from that
particular processor. (Otherwise the abstract model becomes too expensive
to be useful.) Therefore, volatile may say nothing at all about the
interaction between two threads running in parallel on a multiprocessor.

On a high-performance modern SMP system, memory transactions are effectively
pipelined. A memory barrier does not “flush to memory”, but rather inserts
barriers against reordering of operations in the memory pipeline. For this
to have any meaning across processors there must be a critical sequence on
EACH end of a transaction that’s protected by appropriate memory barriers.
This protocol has no possible meaning for an isolated volatile variable,
and therefore cannot be applied.

The protocol can only be employed to protect the relationship between two
items; e.g., “if I assert this flag then this data has been written” paired
with “if I can see the flag is asserted, then I know the data is valid”.

That’s how a mutex works. The mutex is a “flag” with builtin barriers
designed to enforce the visibility (and exclusion) contract with data
manipulations that occur while holding the mutex. Making the data volatile
contributes nothing to this protocol, but inhibits possibly valuable
compiler optimizations within the code that holds the mutex, reducing
program efficiency to no (positive) end.

If you have a way to generate inline barriers (or on a machine that doesn’t
require barriers), and you wish to build your own low-level protocol that
doesn’t rely on synchronization (e.g., a mutex), then your compiler might
require that you use volatile — but this is unspecified by either ANSI C
or POSIX. (That is, ANSI C doesn’t recognize parallelism and therefore
doesn’t apply, while POSIX applies no specific additional semantics to
“volatile”.) So IF you need volatile, your code is inherently nonportable.

A corollary is that if you wish to write portable code, you have no need for
volatile. (Or at least, if you think you do have a need, it won’t help you
any.)

In your case, trying to share (for unsynchronized read) a “volatile”
counter… OK. Fine. The use of volatile, portably, doesn’t help; but as
long as you’re not doing anything but “ticking” the counter, (not a lot of
room for optimization) it probably won’t hurt. IF your variable is of a
size and alignment that the hardware can modify atomically, and IF the
compiler chooses the right instructions (this may be more likely with
volatile, statistically, but again is by no means required by any
standard), then the worst that can happen is that you’ll read a stale
value. (Potentially an extremely stale value, unless there’s some
synchronization that ensures memory visibility between the threads at some
regular interval.) If the above conditions are NOT true, then you may read
“corrupted” values through word tearing and related effects.

If that’s acceptable, you’re probably OK… but volatile isn’t helping you.

—–

http://www.aristeia.com/Papers/DDJ_Jul_Aug_2004_revised.pdf

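I think this article explains things well enough.
It was written by Scott Meyers and Andrei Alexandrescu, both of whom are well known in the C/C++ world, so it should be worth reading. It was also reviewed by, among others, Doug Lea, who has recently become very well known for his work on Java concurrency.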
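
For the one-time initialization problem that double-checked locking tries to solve, POSIX already provides `pthread_once`. The sketch below is my own illustration, not code from the article; `config_t`, `init_config`, `get_config` and `run_once_demo` are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

typedef struct { int value; } config_t;   /* hypothetical lazily built object */

static config_t config;
static int init_calls = 0;                /* counts how often the initializer ran */
static pthread_once_t config_once = PTHREAD_ONCE_INIT;

static void init_config(void)
{
    config.value = 42;
    init_calls++;
}

/* Every caller goes through pthread_once; no hand-rolled flag check. */
const config_t *get_config(void)
{
    pthread_once(&config_once, init_config);
    return &config;
}

static void *reader(void *arg)
{
    int *out = arg;
    *out = get_config()->value;
    return NULL;
}

/* Hypothetical driver: several threads race to call get_config().
 * Returns how many times init_config ran, or -1 on any failure. */
int run_once_demo(void)
{
    enum { NUM_READERS = 8 };
    pthread_t threads[NUM_READERS];
    int seen[NUM_READERS];
    int started = 0;

    while (started < NUM_READERS &&
           pthread_create(&threads[started], NULL, reader, &seen[started]) == 0)
        started++;

    for (int i = 0; i < started; i++) {
        int rc = pthread_join(threads[i], NULL);
        assert(rc == 0);
        (void)rc;
    }
    if (started != NUM_READERS)
        return -1;

    for (int i = 0; i < NUM_READERS; i++)
        if (seen[i] != 42)
            return -1;
    return init_calls;
}
```

However many threads race into `get_config()`, the initializer runs once and every caller sees the fully initialized object, with no volatile and no manual barriers.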