
Conversation

@bitemyapp (Contributor) commented Dec 29, 2017:

Okay, so while this does use the bulk API, I didn't see an appreciable throughput difference, because of the load-shedding behavior of the default scribe. Hypothetically, it should let you do more logging with less stress on your ES shards.

However, the real purpose for this was that I wanted a logger that wouldn't load-shed. As far as I could tell from my testing, no documents get lost by the bulk scribe as long as it is permitted to finish a final sync before closing the scribe.

Attrition rates for the regular scribe were as follows in my testing:

No added delay to simulate work (12 microsecond average iteration time): 99.5% loss. The loss rate is pretty intense without a delay.

1 millisecond delay per iteration: 86% loss.

5 millisecond delay: 18% loss.

10 millisecond delay: 0% loss.

At four log messages per iteration, a 10 millisecond delay works out to one log message (document) every 2.5 milliseconds, so hucking documents faster than that will start resulting in load-shedding. I used Elasticsearch 5.5.0 running locally for the test.

The bulk scribe didn't lose anything as long as I gave it a second or two before closing the scribe.

My next concern was throughput. The regular and bulk scribe were broadly comparable in throughput, with the bulk scribe occasionally being a mite slower.

Then came tail latency / outlier iteration time.

There were time-to-complete outliers when the delay time was 1, 5, or 10 milliseconds per four log messages. At 100 milliseconds the outlier was barely greater than the average. I took the max of all runtimes to get the outlier, so this is the worst-case scenario.

For what it's worth, I think it is sensible to have both of these scribes but I am not sure which one to recommend. I feel like the trade-offs would ideally be more explicit than "default" and "bulk."

Profiling the bulk scribe didn't reveal anything interesting. This was a real pleasure to write in Haskell, and being able to mimic the structure of the original scribe saved me a lot of grief too.

Please let me know what you think.

@bitemyapp (Contributor, Author):

Repeating here for posterity:

The findings aren't from the benchmarks I wrote. The benchmarks aren't super helpful in their current state, and I should think about changing them to match the private stress test I made to measure loss, tail latency, and throughput.

@MichaelXavier (Collaborator) left a comment:

Overall looks good to me aside from a few comments here and there. I will probably need a cleanup pass on comments. I think we should also workshop what we should recommend to users. I agree that it sucks to load-shed in a logger, and if using the bulk API gets us out of that, is there a compelling reason not to make that the one supported path, even if it isn't a significant performance boon, or is even a slight penalty?

-- internal mechanisms that will change without notice.
module Katip.Core where

-------------------------------------------------------------------------------
Collaborator:

Is there an automatic process here that's removing all of these comment sections? This is a coding style thing we do at Soostone.

Contributor Author:

No, just me. I usually avoid doing petty stuff like this but they rankled me. Sorry about that.


go :: Int -> [V5.BulkOperation] -> Int -> IO ()
go delayTime xs len = do
  result <- race (UNB.readChan outChan) (threadDelay delayTime)
Collaborator:

I'm a little bit worried about the safety of this race. IIRC race will run 2 async threads and will cancel the losing thread after the winner returns. Is it possible for the delay to hit simultaneously with an item being removed from the queue effectfully, the thread to be canceled and that item to be dropped on the floor?

It seems like maybe readChanOnException could be used to make this async-exception safe (and the ThreadKilled exception would be delivered asynchronously). I guess the danger there is if the chan is full and the readChan was blocking: in the cleanup function we provide to readChanOnException, you'd have to do a tryWriteChan to make a best-effort attempt to give the item back without blocking.
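
Roughly what I have in mind, as a sketch only (assuming UNB is Control.Concurrent.Chan.Unagi.Bounded; the helper names are made up):

import Control.Concurrent (threadDelay)
import Control.Concurrent.Async (race)
import Control.Monad (void)
import qualified Control.Concurrent.Chan.Unagi.Bounded as UNB

-- What the PR does now: the losing readChan gets cancelled by race, which is
-- where the worry about dropping a claimed item comes from.
readOrGiveUp :: UNB.OutChan a -> Int -> IO (Maybe a)
readOrGiveUp outChan delayMicros =
  either Just (const Nothing) <$>
    race (UNB.readChan outChan) (threadDelay delayMicros)

-- The readChanOnException variant: the handler is passed an action yielding
-- the element the read would have returned, and we make a best-effort,
-- non-blocking attempt to put it back with tryWriteChan.
readOrRequeue :: UNB.InChan a -> UNB.OutChan a -> Int -> IO (Maybe a)
readOrRequeue inChan outChan delayMicros =
  either Just (const Nothing) <$>
    race (UNB.readChanOnException outChan giveBack) (threadDelay delayMicros)
  where
    giveBack grab = grab >>= void . UNB.tryWriteChan inChan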

Contributor Author:

I talked to Brandon Simmons (the unagi-chan author); he said to spin tryRead in a loop for this purpose.

Collaborator:

I suppose that makes sense. I'm kind of curious how hard that would be on the CPU. We're not in STM, so we don't have a lot of options for waking up the loop only when things change. Does he suggest a check delay or anything to mitigate that?

Contributor Author:

Exact wording, hopefully he won't mind: "So if you've got a single reader then the clearest way to do what you want would probably be to use tryReadChan and make sure that if a blocking read times out that you retry (that's what tryReadChan gives you: the ability to block, or busy-wait and retry as much as you like), before proceeding with another readChan on the queue."

Sounds like busy-spin as we discussed. It's not clear to me how this avoids a faulty read if the raced timeout lands in the middle of the readChan.

Some arithmetic could be used to wait rather than attempting the genuine read if the time differential is within #{X} nanos of the timeout deadline but I don't like hitting up the clock in a hot loop like this.

Collaborator:

Thanks for the additional clarification. Sounds good. I say let's move forward with this simple fallback-to-spin-loop approach and see where it gets us.

Contributor Author:

https://github.com/bitemyapp/break-unagi-chan has some results from my attempt at shaking out how unagi-chan would behave in the pessimal case.

I'm not 100% certain how to interpret the results (I'd love to know what you think), but I got basically the same results from the regular and "exception safe" variants.

As it stands, I'd say that I would personally be comfortable going forward with this variant as a harm reduction relative to the current scribe's behavior (the load-shedding can get extreme).

All my alternatives would involve load-shedding (STM of bounded collection), livelock (TQueue, TBQueue), or unbounded memory use (STM of unbounded collection). I'd also have to be careful to not make the same mistake, whatever it was, that induces livelock in TQueue/TBQueue/TBMQueue.

I'll focus on exposing configuration bits for the moment; I'd like to know what you think in the meantime.

Collaborator:

Thanks for the research! When you say "going forward with this variant," do you mean with the tryReadChan as suggested by the unagi-chan author?

@bitemyapp (Contributor, Author) commented Jan 12, 2018:

https://github.com/bitemyapp/break-unagi-chan/blob/master/src/Lib.hs#L59-L60

you mean with the tryReadChan as suggested by the unagi-chan author?

I'm not 100% certain of what they meant, but it's definitely not a busy-spin. Instead, I'm using timeout on the blocking IO action returned by tryReadChan, or just readChan. System.Timeout.timeout seems to behave better than race'ing a threadDelay, and I would really like to know why.
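
Something along these lines, as a sketch only rather than the exact code in the branch (assuming UNB is Control.Concurrent.Chan.Unagi.Bounded; the helper name is made up):

import System.Timeout (timeout)
import qualified Control.Concurrent.Chan.Unagi.Bounded as UNB

readUpTo :: Int -> UNB.OutChan a -> IO (Maybe a)
readUpTo deadlineMicros outChan = do
  (_element, blockingRead) <- UNB.tryReadChan outChan
  -- Time-limit the blocking read; on Nothing the caller should retry against
  -- the same blocking action (or the Element) rather than calling tryReadChan
  -- again, so the reserved slot isn't abandoned.
  timeout deadlineMicros blockingRead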

I'm also not completely certain of the validity or meaning of my results in the README of the repository. I'd like a second pair of eyes on how I'm shaking it out if you or anyone else has the time.

I'd say that, worst case, we're still losing data, but it's far, far less than with the existing load-shedding behavior, and it's overall a more efficient scribe in terms of ES resources.

Collaborator:

Incoming text wall.

Took a closer look at the code, and I'm starting to be convinced this is pretty hard to test. In your code you fill a chan with all the data that will ever be pushed to it. You then try your algorithms and give them decreasing "patience" for how long they'll wait to read data that we're certain is in the chan, down to something like 1 microsecond (if I understand correctly), which I think may even be below the practicality of the scheduler. Eventually I would expect this to break down, but I think this may just be testing the constant factors of each mechanism and not the safety.

The hypothetical data loss scenario we're talking about is when data is available to the chan at almost exactly the same time as the worker will give up, with the idea that a read from the chan isn't atomic and could get interrupted by a timeout and drop the message on the floor. In order to compare this across implementations somewhat deterministically I think you'd start with an empty queue and then coordinate many pushes to it that are all right around the time where it'd give up (and then have your implementation retry in any case). Like running a particle accelerator and looking for collisions. I think that means your implementation under test would need to keep track of the last number it saw so it could simply abort on first anomaly. That way, it could differentiate between successfully timing out but getting the data the next time and actually losing data.

So it seems like from the code there are 3 variations we want to consider:

  1. race blocking read with threadDelay
  2. race blocking read w/ attempted re-queueing with thread delay (so, theoretically safer, maybe bad perf, maybe not effective at preventing lost writes)
  3. blocking version of tryReadChan.

I think we can probably eliminate number 3. It seems like in the unagi code, that's exactly equivalent to 1:

tryReadChan :: OutChan a -> IO (UT.Element a, IO a)
tryReadChan oc = do -- no mask necessary
    (seg,segIx) <- startReadChan oc
    return ( ... elided, promise version unused ... , readSegIxUnmasked id (seg,segIx))

readChan :: OutChan a -> IO a
readChan = \oc-> startReadChan oc >>= readSegIxUnmasked id

So now we're down to: is readChanOnException with timeout safer than readChan with timeout? If so, is there a performance impact to using readChanOnException?

For checking safety, I'm not sure the best way but maybe we should just batter a queue with data and abort on first anomaly. So maybe:

  1. Spawn a writer with random short publish delays roughly around the timeout in the reader.
  2. Spawn a reader that retries, and remembers the last number it saw and blows up on first out of sequence.
  3. Run the program for a while and see if we can ever produce the bug. We would do this with plain readChan first.

I feel like if we can get up to some high number of attempts without creating this bug, then we can probably just stop worrying about it, race readChan and call it a day. If we do create the bug, see if we can recreate it using the "safe" readChan. If not, see if perf disqualifies the safe variant.
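
Roughly the shape of the rig I'm imagining, as a sketch only (assumes Control.Concurrent.Chan.Unagi.Bounded and async; all names are made up):

import Control.Concurrent (threadDelay)
import Control.Concurrent.Async (concurrently_, race)
import Control.Monad (forM_)
import qualified Control.Concurrent.Chan.Unagi.Bounded as UNB
import System.Random (randomRIO)

collide :: Int -> Int -> IO ()
collide messageCount readerTimeoutMicros = do
  (inChan, outChan) <- UNB.newChan 4096
  concurrently_ (writer inChan) (reader outChan 0)
  where
    -- Publish delays jittered to land right around the reader's timeout, to
    -- maximize the chance that a timeout fires in the middle of a read.
    writer inChan = forM_ [1 .. messageCount] $ \i -> do
      jitter <- randomRIO (readerTimeoutMicros - 5, readerTimeoutMicros + 5)
      threadDelay (max 1 jitter)
      UNB.writeChan inChan i

    -- Reader retries on timeout, remembers the last number it saw, and blows
    -- up on the first out-of-sequence read.
    reader outChan lastSeen
      | lastSeen >= messageCount = pure ()
      | otherwise = do
          result <- race (UNB.readChan outChan) (threadDelay readerTimeoutMicros)
          case result of
            Right () -> reader outChan lastSeen
            Left i
              | i == lastSeen + 1 -> reader outChan i
              | otherwise ->
                  error ("lost message(s): expected " ++ show (lastSeen + 1)
                         ++ ", got " ++ show i)

We'd run that with plain readChan first (as in the sketch), then swap in the "safe" variant if it ever trips.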

I'll take a crack locally at your repo to see if I can rig this test setup.

(V5.DocId docId) val

discrimLen l
  | l >= 500 = SendIt
Collaborator:

I wonder if we should make this configurable.

Contributor Author:

The figure was the product of throughput testing. I don't know why someone would want to change it, but yeah, it probably should be configurable.
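
Hypothetically something like this (all names besides SendIt are made up; Buffer stands in for whatever the non-send branch is called in the real code):

newtype BulkFlushThreshold = BulkFlushThreshold Int

defaultBulkFlushThreshold :: BulkFlushThreshold
defaultBulkFlushThreshold = BulkFlushThreshold 500

data FlushDecision = SendIt | Buffer

discrimLen :: BulkFlushThreshold -> Int -> FlushDecision
discrimLen (BulkFlushThreshold n) l
  | l >= n    = SendIt
  | otherwise = Buffer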

startBulkWorker _ env mapping dieSignal outChan = do
  -- We need to randomize upload delay
  -- so that workers do not stampede the nodes
  delayTime' <- randomRIO (100, 2000)
Collaborator:

Should we make this configurable?

Contributor Author:

Almost certainly yes, I'll lift it out.
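
Probably something shaped like this (illustrative names only):

import System.Random (randomRIO)

newtype UploadDelayRange = UploadDelayRange (Int, Int)

defaultUploadDelayRange :: UploadDelayRange
defaultUploadDelayRange = UploadDelayRange (100, 2000)

-- inside startBulkWorker, instead of the hard-coded range:
randomDelay :: UploadDelayRange -> IO Int
randomDelay (UploadDelayRange range) = randomRIO range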

@bitemyapp changed the title from "Bulk elasticsearch scribe" to "WIP: Bulk elasticsearch scribe" on Apr 8, 2018

@bitemyapp (Contributor, Author) commented:

Still working on this. Pretty close to having something we could choose to release which would be strictly better than the old scribe.

However, the benchmark is still showing tremendous data loss (albeit with 10x more retained than the old scribe), and I'd really like to sort that out before pushing toward a release.

