for loop - Concurrent access to maps with 'range' in Go
The "Go maps in action" entry in the Go blog states:
Maps are not safe for concurrent use: it's not defined what happens when you read and write to them simultaneously. If you need to read from and write to a map from concurrently executing goroutines, the accesses must be mediated by some kind of synchronization mechanism. One common way to protect maps is with sync.RWMutex.
However, one common way to access maps is to iterate over them with the range keyword. It is not clear whether, for the purposes of concurrent access, execution inside a range loop counts as a "read", or whether only the "turnover" phase of that loop does. For example, the following code may or may not run afoul of the "no concurrent r/w on maps" rule, depending on the specific semantics / implementation of the range operation:
    var testmap = make(map[int]int)
    testmaplock := make(chan bool, 1) // one-slot channel used as a lock
    testmaplock <- true
    testmapsequence := 0
    ...
    func writetestmap(k, v int) {
        <-testmaplock
        testmap[k] = v
        testmapsequence++
        testmaplock <- true
    }

    func iteratemapkeys(iteratorchannel chan int) error {
        <-testmaplock
        defer func() {
            testmaplock <- true
        }() // the deferred function must be called, not just declared
        myseq := testmapsequence
        for k, _ := range testmap {
            testmaplock <- true  // release the lock while the consumer takes k
            iteratorchannel <- k
            <-testmaplock        // re-acquire before the next iteration step
            if myseq != testmapsequence {
                close(iteratorchannel)
                return errors.New("concurrent modification")
            }
        }
        return nil
    }
The idea here is that the range "iterator" is open while the second function is waiting for a consumer to take the next value, and the writer is not blocked at that time. However, it is never the case that two reads in a single iterator fall on either side of a write - this is a "fail fast" iterator, to borrow a Java term.
Is there anywhere in the language specification or other documents that indicates whether this is a legitimate thing to do, however? I could see it going either way, and the document quoted above is not clear on exactly what constitutes a "read". The documentation seems totally quiet on the concurrency aspects of the for/range statement.
(Please note that this question is about the concurrency of for/range, and is not a duplicate of: Golang concurrent map access with range - the use case is different, and I am asking about the precise locking requirement with respect to the 'range' keyword here!)
You are using a for statement with a range expression. Quoting from the spec, For statements:
The range expression is evaluated once before beginning the loop, with one exception: if the range expression is an array or a pointer to an array and at most one iteration variable is present, only the range expression's length is evaluated; if that length is constant, by definition the range expression itself will not be evaluated.
We're ranging over a map, so this is not the exception: the range expression is evaluated only once, before beginning the loop. The range expression is simply the map variable testmap:
    for k, _ := range testmap {}
The map value does not include the key-value pairs; it only points to a data structure that does. Why is this important? Because the map value is evaluated only once, and if pairs are later added to the map, the map value (evaluated once before the loop) still points to a data structure that includes those new pairs. This is in contrast to ranging over a slice (which is evaluated once too). A slice is also only a header pointing to a backing array that holds the elements, but if elements are added to the slice during the iteration, even if that does not result in allocating and copying over to a new backing array, they will not be included in the iteration (because the slice header also contains the length, which has already been evaluated). Appending elements to a slice may result in a new slice value, but adding pairs to a map does not result in a new map value.
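The difference between the two cases can be sketched in a few lines. This is only an illustration: the helper names countSliceIterations and countMapIterations are hypothetical and not part of the question's code. The slice loop always runs exactly three times, while the map loop runs at least three times and may or may not visit the entries added along the way.

```go
package main

import "fmt"

// countSliceIterations (hypothetical helper) appends to a slice while
// ranging over it and reports how many iterations ran and the final length.
func countSliceIterations() (iterations, finalLen int) {
	s := []int{1, 2, 3}
	for range s {
		// s grows, but the loop uses the slice value evaluated before the loop.
		s = append(s, 0)
		iterations++
	}
	return iterations, len(s)
}

// countMapIterations (hypothetical helper) adds a new key on every iteration
// of a map range, up to limit entries, and reports iterations and final size.
func countMapIterations(limit int) (iterations, finalLen int) {
	m := map[int]int{0: 0, 1: 1, 2: 2}
	for range m {
		if len(m) < limit {
			m[len(m)] = 0 // keys 0..len(m)-1 exist, so this is a new entry
		}
		iterations++
	}
	return iterations, len(m)
}

func main() {
	si, sl := countSliceIterations()
	fmt.Println("slice: iterations =", si, "final length =", sl)

	mi, ml := countMapIterations(100)
	// The map iteration count is not deterministic: new entries may be
	// produced or skipped, and the choice can vary from run to run.
	fmt.Println("map: iterations =", mi, "final size =", ml)
}
```

Both loops run in a single goroutine, so no locking is needed here; the sketch only shows what the once-evaluated range expression does and does not freeze.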
Now on to the iteration:
    for k, v := range testmap {
        t1 := time.Now()
        somefunc()
        t2 := time.Now()
    }
Before we enter the block, before the t1 := time.Now() line, the k and v variables already hold the values of the iteration; they have already been read out of the map (otherwise they couldn't hold the values). Question: do you think the map is read by the for ... range statement between t1 and t2? Under what circumstances could that happen? We have a single goroutine here, which is executing somefunc(). For the for statement to access the map during that time, it would either require another goroutine, or it would require suspending somefunc(). Obviously neither of those happens. (The for ... range construct is not a multi-goroutine monster.) No matter how many iterations there are, while somefunc() is executing, the map is not accessed by the for statement.
So to answer one of your questions: the map is not accessed inside the for block while an iteration is executing, but it is accessed when the k and v values are set (assigned) for the next iteration. This implies that the following iteration over the map is safe for concurrent access:
    var (
        testmap     = make(map[int]int)
        testmaplock = &sync.RWMutex{}
    )

    func iteratemapkeys(iteratorchannel chan int) error {
        testmaplock.RLock()
        defer testmaplock.RUnlock()
        for k, v := range testmap {
            testmaplock.RUnlock() // the map is not touched while the body runs
            somefunc()            // placeholder for work that uses k and v
            testmaplock.RLock()   // held again before the next k, v are read
            if somecond {
                return someerr
            }
        }
        return nil
    }
Note that the unlocking in iteratemapkeys() should (must) happen as a deferred statement, because in your original code you may return "early" with an error, in which case you would not unlock, which means the map would remain locked! (Modeled here by if somecond {...}.)
Also note that this type of locking only ensures that accesses are synchronized. It does not prevent a concurrent goroutine from modifying the map (e.g. adding a new pair). Such a modification (if properly guarded with the write lock) is safe, and the loop may continue, but there is no guarantee whether the loop will iterate over the new pair:
If map entries that have not yet been reached are removed during iteration, the corresponding iteration values will not be produced. If map entries are created during iteration, that entry may be produced during the iteration or may be skipped. The choice may vary for each entry created and from one iteration to the next.
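The removal half of that quote is deterministic and easy to check. In the following minimal sketch, visitAfterDeletingRest is a hypothetical helper, not something from the question: on the first iteration it deletes every other key, so the loop should produce exactly one key regardless of iteration order.

```go
package main

import "fmt"

// visitAfterDeletingRest (hypothetical helper) ranges over m and, as soon as
// it sees a key, deletes all other keys. It returns the keys the outer loop
// actually produced.
func visitAfterDeletingRest(m map[int]string) []int {
	var visited []int
	for k := range m {
		visited = append(visited, k)
		// Remove every entry the outer loop has not reached yet.
		for other := range m {
			if other != k {
				delete(m, other)
			}
		}
	}
	return visited
}

func main() {
	m := map[int]string{1: "a", 2: "b", 3: "c", 4: "d"}
	visited := visitAfterDeletingRest(m)
	// Which key survives depends on the (unspecified) iteration order,
	// but only one key is ever produced.
	fmt.Println("keys produced:", len(visited), "entries left:", len(m))
}
```

Everything happens in one goroutine here, which the spec allows; with a second goroutine doing the deletes, the same guarantee only applies if the accesses are synchronized.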
The write-lock-guarded modification may look like this:
    func writetestmap(k, v int) {
        testmaplock.Lock()
        defer testmaplock.Unlock()
        testmap[k] = v
    }
Now, if you release the read lock inside the block of the for, a concurrent goroutine is free to grab the write lock and modify the map. In your code:
    testmaplock <- true
    iteratorchannel <- k
    <-testmaplock
While k is being sent on iteratorchannel, a concurrent goroutine may modify the map. This is not just an "unlucky" scenario: sending a value on a channel is often a "blocking" operation. If the channel is unbuffered or its buffer is full, another goroutine must be ready to receive before the send can proceed. Sending a value on a channel is also a good scheduling point for the runtime to run other goroutines, even on the same OS thread, not to mention the case of multiple OS threads, one of which may already be "waiting" for the write lock in order to carry out a map modification.
To sum up the last part: releasing the read lock inside the for block is like yelling to others: "Come, modify the map now if you dare!" Consequently, in your code, encountering myseq != testmapsequence is very likely. See this runnable example that demonstrates it (it's a variation of your example):
    package main

    import (
        "fmt"
        "math/rand"
        "sync"
    )

    var (
        testmap         = make(map[int]int)
        testmaplock     = &sync.RWMutex{}
        testmapsequence int
    )

    func main() {
        // Writer: keeps modifying the map under the write lock.
        go func() {
            for {
                k := rand.Intn(10000)
                writetestmap(k, 1)
            }
        }()

        // Consumer: drains the keys sent by the iterator.
        ic := make(chan int)
        go func() {
            for range ic {
            }
        }()

        // Iterate forever, reporting every detected modification.
        for {
            if err := iteratemapkeys(ic); err != nil {
                fmt.Println(err)
            }
        }
    }

    func writetestmap(k, v int) {
        testmaplock.Lock()
        defer testmaplock.Unlock()
        testmap[k] = v
        testmapsequence++
    }

    func iteratemapkeys(iteratorchannel chan int) error {
        testmaplock.RLock()
        defer testmaplock.RUnlock()
        myseq := testmapsequence
        for k, _ := range testmap {
            testmaplock.RUnlock() // writers may sneak in from here...
            iteratorchannel <- k
            testmaplock.RLock() // ...until here
            if myseq != testmapsequence {
                //close(iteratorchannel)
                return fmt.Errorf("concurrent modification %d", testmapsequence)
            }
        }
        return nil
    }
Example output:
    concurrent modification 24
    concurrent modification 41
    concurrent modification 463
    concurrent modification 477
    concurrent modification 482
    concurrent modification 496
    concurrent modification 508
    concurrent modification 521
    concurrent modification 525
    concurrent modification 535
    concurrent modification 541
    concurrent modification 555
    concurrent modification 561
    concurrent modification 565
    concurrent modification 570
    concurrent modification 577
    concurrent modification 591
    concurrent modification 593
We're encountering concurrent modification quite often!
Do you want to avoid this kind of concurrent modification? The solution is quite simple: don't release the read lock inside the for. Also run your app with the -race option to detect race conditions:
    go run -race testmap.go
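As a rough sketch of that advice, the variation below keeps the read lock for the whole loop. The names iterateheld and visit are hypothetical, and the callback stands in for whatever the loop body does with each key. Because the writer needs the exclusive lock, the sequence number cannot change while the loop runs, so the program should report that no modification was observed.

```go
package main

import (
	"fmt"
	"sync"
)

var (
	testmap         = make(map[int]int)
	testmaplock     sync.RWMutex
	testmapsequence int
)

func writetestmap(k, v int) {
	testmaplock.Lock()
	defer testmaplock.Unlock()
	testmap[k] = v
	testmapsequence++
}

// iterateheld (hypothetical helper) ranges over testmap while holding the
// read lock for the entire loop and reports whether the sequence number
// changed during the iteration.
func iterateheld(visit func(k int)) (changed bool) {
	testmaplock.RLock()
	defer testmaplock.RUnlock()
	myseq := testmapsequence
	for k := range testmap {
		visit(k) // runs with the read lock held; writers have to wait
		if myseq != testmapsequence {
			changed = true
		}
	}
	return changed
}

func main() {
	stop := make(chan struct{})
	var wg sync.WaitGroup
	wg.Add(1)
	go func() { // concurrent writer
		defer wg.Done()
		for i := 0; ; i++ {
			select {
			case <-stop:
				return
			default:
				writetestmap(i%1000, i)
			}
		}
	}()

	changed := false
	for i := 0; i < 100; i++ {
		if iterateheld(func(int) {}) {
			changed = true
		}
	}
	close(stop)
	wg.Wait()
	fmt.Println("concurrent modification observed:", changed)
}
```

The trade-off is that writers stay blocked for as long as the loop body takes, so if the body sends on a channel to a slow consumer, every writer waits for that consumer too.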
Final thoughts
The language spec clearly allows you to modify the map in the same goroutine while ranging over it; this is what the previous quote relates to ("If map entries that have not yet been reached are removed during iteration.... If map entries are created during iteration..."). Modifying the map in the same goroutine is allowed and is safe, but how it is handled by the iterator logic is not defined.
If the map is modified in another goroutine and you use proper synchronization, the Go Memory Model guarantees that the goroutine running the for ... range will observe all modifications, and the iterator logic will see them just as if "its own" goroutine had made them, which is allowed, as stated before.