rogpeppe

Member Since 12 years ago

InfluxData, Newcastle upon Tyne, UK

412 followers, 0 following, 40 stars, 55 repos

761 contributions in the last year

Pinned
⚡ Avro codec and code generation for Go
⚡ Print where symbols are defined in Go source code
⚡ Dependable Go errors with tracebacks
⚡ Make temporary edits to your Go module dependencies
⚡ Selected Go-internal packages factored out from the standard library
⚡ Go helper types for issuing and handling HTTP requests
Activity
Nov 26 (2 days ago)
issue

rogpeppe issue comment golang/go

rogpeppe

proposal: Go 2: error map for error handling

I propose the following interrelated changes to the Go language:

  1. Each declared or imported function (including each method) automatically and implicitly declares and initializes an error map, as if by the statement @function_name := map[string]bool{}. Alternatively, an error_ or error. prefix could be used before the function name instead of the @ prefix, or the function and its error map could share the same name.

  2. The error map must be visible both inside the body of the function and throughout the scope of the declared or imported function. The scope of the error map is the same as the scope of the function's parameters.

That said, Apache Kafka offers an appealing idea: more flexible, dynamic and centralized management of the visibility areas (topics) of (error) messages.

  3. The content of the error map should be updated and visible instantly, well before the called function returns, so that the calling function can decide in advance whether the called function needs to be interrupted and how to handle errors.

Cases such as assigning functions to variables or passing functions to other functions need further investigation. Instead of a map, another container for error messages could be used if it turns out to be more convenient: a set, slice, stack, etc.

Description of use:

Programmers should use error types as keys in the error map.

Each function can throw several errors of different types and severity, which can then be handled in different ways (with or without exiting the function where the error occurred, with or without returning parameters). If an error occurs, the value for its type in the error map must be true. The statement @function_name["error_type"] = true is therefore required in the function body, but it would be preferable for warning("error_type") and escape("error_type") (the latter also exiting the erroneous function) to play that role.

If the function is used several times in the same scope, every error type raised by any of those uses appears in the error map.

If an if or switch statement checks @function_name["error_type"] with an error type that is not in the error map, the expression evaluates to false, which is convenient and obvious. A decision table can be used together with an error map for error handling and reporting in difficult cases.

Benefits of the proposal:

  1. Very concise and obvious notation even for a novice Go programmer
  2. Change is backward compatible with Go 1 (replaces, but can be used in parallel with existing error handling methods). Therefore it can be done before Go 2
  3. Simple implementation
  4. Doesn't affect compilation time and performance
  5. Explicit and composite error naming in the calling function
  6. Arbitrary and easy error naming in the function in which the error occurred (including dynamic name generation)
  7. Ability to send errors along the chain of function calls
  8. The compiler can catch unhandled explicitly specified errors
  9. Each function can throw several errors of different types and severity, which can then be handled in different ways (including with or without immediately exiting the function where the error occurred, with or without returning parameters)
  10. If the corresponding function is used multiple times in the same scope, then all different types of errors will be handled correctly
  11. A decision table can be used together with an error map for error handling and reporting in difficult cases

Examples of code before and after proposal

// Before proposal

package main

import (
    "errors"
    "fmt"
    "strings"
)

func capitalize(name string) (string, error) {
    if name == "" {
        return "", errors.New("no name provided")
    }
    return strings.ToTitle(name), nil
}

func main() {
    _, err := capitalize("")
    if err != nil {
        fmt.Println("Could not capitalize:", err)
        return
    }
    fmt.Println("Success!")
}

// =================================================================

// After proposal

package main

import ( // "errors" is not imported
    "fmt"
    "strings"
)

func capitalize(name string) string { // also declares and initializes an error map @capitalize, as does the operator @capitalize := map[string]bool{}
    if name == "" {
        warning("no name provided") // new keyword. Without escape from erroneous function. Equivalent to @capitalize["no name provided"] = true
        // escape("no name provided") // new keyword. With escape from erroneous function
        return ""
    }
    return strings.ToTitle(name)
}

func main() {
    capitalize("") // call the function so that its error map is populated
    if @capitalize["no name provided"] {  // explicit error naming in the calling function after proposal
        fmt.Println("Could not capitalize: no name provided")
        return
    }
    fmt.Println("Success!")
}


rogpeppe
func f(c <-chan func()) {
    (<-c)()
    // How can I find out an error from the function call here?
}
issue

rogpeppe issue comment juju/ratelimit

rogpeppe

If not enough tokens are available, no tokens should be removed from the bucket.

func (tb *Bucket) take(now time.Time, count int64, maxWait time.Duration) (time.Duration, bool) {
	...
	avail := tb.availableTokens - count
	if avail >= 0 {
		tb.availableTokens = avail
		return 0, true
	}
	...
	tb.availableTokens = avail
	return waitTime, true
}

If not enough tokens are available, no tokens should be removed from the bucket (see https://en.wikipedia.org/wiki/Token_bucket#Algorithm). Otherwise, availableTokens would drop below zero and would never be enough if many threads keep trying to take tokens from the bucket in a loop.

rogpeppe

That function returns the number of taken tokens and the amount of time the caller must wait until they're available. If you don't sleep for that length of time, you're not following the contract. The reason it's designed that way is so you can wait for other things to happen at the same time (for example by using that duration in a call to timer.Reset).

This is working as designed and as intended. Unless you can show an example of it not working correctly, this issue should be closed.

open pull request

rogpeppe wants to merge influxdata/influxdb_iox

rogpeppe

fix: Implement a dummy Storage.Offsets method

Closes https://github.com/influxdata/conductor/issues/766

(@brettbuddin this is the reason why your https://github.com/influxdata/idpe/pull/12426 failed to deploy on staging)

rogpeppe

ISTM that this might be better formatted one entry per line to avoid awkward to parse diffs like this. I dunno if that's ever idiomatic in Rust or not.

issue

rogpeppe issue golang/go

rogpeppe

cmd/go2go: generic function type assignment fails to type check correctly

commit 34f76220c32bfe7766ca074d5ae69d6f716b9b2c

The following program seems like it should compile OK, but it does not:

package main

func main() {
}

type S1(type T) struct {
	x S2(T)
}

type S2(type T) struct {
	y sfunc(T)
}

type sfunc(type T) func(*S1(T))

func (*S1(T)) a() {}

func (i *S1(T)) b(f sfunc(T)) {
	i.x.y = f
}

It produces the error:

prog.go2:19:10: cannot use f (variable of type sfunc(T)) as sfunc(T) value in assignment
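
For readers more familiar with the type-parameter syntax that later shipped in Go 1.18, the same program can be written as follows. This is a translation for illustration, not part of the original report; the body of main is added here only to exercise the assignment:

```go
package main

import "fmt"

type S1[T any] struct {
	x S2[T]
}

type S2[T any] struct {
	y sfunc[T]
}

// sfunc refers back to S1, so the three declarations are mutually recursive.
type sfunc[T any] func(*S1[T])

func (*S1[T]) a() {}

// b stores f in the nested struct; this is the assignment that the
// go2go prototype rejected.
func (i *S1[T]) b(f sfunc[T]) {
	i.x.y = f
}

func main() {
	var s S1[int]
	s.b(func(*S1[int]) {})
	fmt.Println(s.x.y != nil)
}
```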
issue

rogpeppe issue comment golang/go

rogpeppe

cmd/go2go: generic function type assignment fails to type check correctly

rogpeppe

This now works correctly on tip.

issue

rogpeppe issue cue-lang/cue

rogpeppe

comprehension inside definition gives "field not allowed" error

commit 9e7d4d63acf9cfcf7fdc33731dedfe420c0e2689

This code gives rise to an error, but it should not:

#D: {
	a: foo: 123
	b: {for k, v in a {(k): v}}
}

The error is:

#D.b: field not allowed: foo:
    ./y.cue:3:5
    ./y.cue:3:6
    ./y.cue:3:20
    ./y.cue:3:21
Nov 24 (4 days ago)
published release

rogpeppe published release "Fix Decoder bug" in influxdata/line-protocol (4 days ago)
created tag (4 days ago)
push

rogpeppe push influxdata/line-protocol

rogpeppe

lineprotocol: fix Decoder bug when decoding from an io.Reader

When we needed more input at the end of the buffer, we were sliding the existing data to the front of the buffer, ignoring the fact that existing tokens could still be pointing into that data, which the slide would then overwrite.

Avoid that possibility by sliding data only when we reset the decoder after each entry has been entirely decoded.

It's likely that this was the cause of https://github.com/influxdata/line-protocol/issues/50, although I haven't managed to recreate the panic. Specifically, in Decoder.NextField, this scenario could have happened:

  • we decode the field tag
  • we decode the field value, triggering a read request which corrupts the field tag by overwriting it.
  • there's an error decoding the value
  • we calculate the original encoded length of the tag by looking at its data, but the corrupted data now contains a character that needs escaping, which it didn't before, so the calculated length is longer than the original.
  • this results in us passing a startIndex value that's too large to d.syntaxErrorf
  • this results in the index out-of-bounds panic
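
The overwrite at the heart of that scenario comes down to slice aliasing, which a few lines of Go can reproduce. This is a standalone sketch, not the Decoder's code: the helper slide and the sample input are made up for illustration.

```go
package main

import "fmt"

// slide moves the unread bytes buf[from:] to the front of buf and
// returns how many bytes were moved. (Hypothetical helper.)
func slide(buf []byte, from int) int {
	return copy(buf, buf[from:])
}

func main() {
	// buf stands in for the decoder's read buffer.
	buf := []byte("cpu,host=a value=1")

	// tag shares buf's backing array, as a decoded token would.
	tag := buf[4:8]
	before := string(tag)

	// Slide the unread tail to the front to make room for more input.
	slide(buf, 11)

	// tag still points at the same bytes, which now hold different data.
	fmt.Printf("before slide: %q, after slide: %q\n", before, string(tag))
}
```

In this sample the bytes under tag change from "host" to text that happens to contain an "=", which is the kind of newly escapable character the scenario describes.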

The reason this issue wasn't caught by fuzzing was that we were only fuzzing with NewDecoderWithBytes, not with NewDecoder itself.

Performance seems largely unaffected:

name                                                           old time/op    new time/op    delta
DecodeEntriesSkipping/long-lines-8                               25.4ms ± 1%    25.4ms ± 1%       ~     (p=0.421 n=5+5)
DecodeEntriesSkipping/long-lines-with-escapes-8                  29.6ms ± 0%    29.4ms ± 0%     -0.60%  (p=0.008 n=5+5)
DecodeEntriesSkipping/single-short-line-8                         408ns ± 1%     407ns ± 1%       ~     (p=0.690 n=5+5)
DecodeEntriesSkipping/single-short-line-with-escapes-8            415ns ± 2%     418ns ± 2%       ~     (p=0.548 n=5+5)
DecodeEntriesSkipping/many-short-lines-8                          178ms ± 1%     175ms ± 1%     -1.49%  (p=0.008 n=5+5)
DecodeEntriesSkipping/field-key-escape-not-escapable-8            369ns ± 2%     367ns ± 0%       ~     (p=0.690 n=5+5)
DecodeEntriesSkipping/tag-value-triple-escape-space-8             447ns ± 2%     442ns ± 0%       ~     (p=0.151 n=5+5)
DecodeEntriesSkipping/procstat-8                                 3.43µs ± 2%    3.35µs ± 1%     -2.29%  (p=0.008 n=5+5)
DecodeEntriesWithoutSkipping/long-lines-8                        25.4ms ± 1%    25.2ms ± 0%     -0.68%  (p=0.016 n=5+5)
DecodeEntriesWithoutSkipping/long-lines-with-escapes-8            101ms ± 1%     101ms ± 0%       ~     (p=0.310 n=5+5)
DecodeEntriesWithoutSkipping/single-short-line-8                  442ns ± 2%     438ns ± 1%       ~     (p=0.310 n=5+5)
DecodeEntriesWithoutSkipping/single-short-line-with-escapes-8     467ns ± 2%     465ns ± 2%       ~     (p=1.000 n=5+5)
DecodeEntriesWithoutSkipping/many-short-lines-8                   205ms ± 2%     207ms ± 0%       ~     (p=0.222 n=5+5)
DecodeEntriesWithoutSkipping/field-key-escape-not-escapable-8     516ns ± 6%     420ns ± 2%    -18.60%  (p=0.008 n=5+5)
DecodeEntriesWithoutSkipping/tag-value-triple-escape-space-8      586ns ± 1%     510ns ± 1%    -13.05%  (p=0.008 n=5+5)
DecodeEntriesWithoutSkipping/procstat-8                          5.76µs ± 2%    6.23µs ± 0%     +8.11%  (p=0.016 n=5+4)

name                                                           old speed      new speed      delta
DecodeEntriesSkipping/long-lines-8                             1.03GB/s ± 1%  1.03GB/s ± 1%       ~     (p=0.421 n=5+5)
DecodeEntriesSkipping/long-lines-with-escapes-8                 886MB/s ± 0%   891MB/s ± 0%     +0.61%  (p=0.008 n=5+5)
DecodeEntriesSkipping/single-short-line-8                      71.0MB/s ± 1%  71.2MB/s ± 1%       ~     (p=0.690 n=5+5)
DecodeEntriesSkipping/single-short-line-with-escapes-8         77.2MB/s ± 2%  76.6MB/s ± 2%       ~     (p=0.548 n=5+5)
DecodeEntriesSkipping/many-short-lines-8                        147MB/s ± 1%   150MB/s ± 1%     +1.52%  (p=0.008 n=5+5)
DecodeEntriesSkipping/field-key-escape-not-escapable-8         89.4MB/s ± 2%  90.0MB/s ± 0%       ~     (p=0.690 n=5+5)
DecodeEntriesSkipping/tag-value-triple-escape-space-8           112MB/s ± 2%   113MB/s ± 0%       ~     (p=0.151 n=5+5)
DecodeEntriesSkipping/procstat-8                                387MB/s ± 1%   396MB/s ± 1%     +2.34%  (p=0.008 n=5+5)
DecodeEntriesWithoutSkipping/long-lines-8                      1.03GB/s ± 1%  1.04GB/s ± 0%     +0.68%  (p=0.016 n=5+5)
DecodeEntriesWithoutSkipping/long-lines-with-escapes-8          259MB/s ± 1%   261MB/s ± 0%       ~     (p=0.286 n=5+5)
DecodeEntriesWithoutSkipping/single-short-line-8               65.6MB/s ± 2%  66.2MB/s ± 1%       ~     (p=0.310 n=5+5)
DecodeEntriesWithoutSkipping/single-short-line-with-escapes-8  68.5MB/s ± 2%  68.9MB/s ± 2%       ~     (p=1.000 n=5+5)
DecodeEntriesWithoutSkipping/many-short-lines-8                 128MB/s ± 2%   126MB/s ± 0%       ~     (p=0.222 n=5+5)
DecodeEntriesWithoutSkipping/field-key-escape-not-escapable-8  64.1MB/s ± 6%  78.6MB/s ± 2%    +22.60%  (p=0.008 n=5+5)
DecodeEntriesWithoutSkipping/tag-value-triple-escape-space-8   85.3MB/s ± 1%  98.1MB/s ± 1%    +15.02%  (p=0.008 n=5+5)
DecodeEntriesWithoutSkipping/procstat-8                         230MB/s ± 2%   213MB/s ± 0%     -7.51%  (p=0.016 n=5+4)

name                                                           old alloc/op   new alloc/op   delta
DecodeEntriesSkipping/long-lines-8                                 512B ± 0%      512B ± 0%       ~     (all equal)
DecodeEntriesSkipping/long-lines-with-escapes-8                    512B ± 0%      512B ± 0%       ~     (all equal)
DecodeEntriesSkipping/single-short-line-8                          512B ± 0%      512B ± 0%       ~     (all equal)
DecodeEntriesSkipping/single-short-line-with-escapes-8             512B ± 0%      512B ± 0%       ~     (all equal)
DecodeEntriesSkipping/many-short-lines-8                           512B ± 0%      512B ± 0%       ~     (all equal)
DecodeEntriesSkipping/field-key-escape-not-escapable-8             512B ± 0%      512B ± 0%       ~     (all equal)
DecodeEntriesSkipping/tag-value-triple-escape-space-8              512B ± 0%      512B ± 0%       ~     (all equal)
DecodeEntriesSkipping/procstat-8                                   512B ± 0%      512B ± 0%       ~     (all equal)
DecodeEntriesWithoutSkipping/long-lines-8                          512B ± 0%      512B ± 0%       ~     (all equal)
DecodeEntriesWithoutSkipping/long-lines-with-escapes-8           17.4kB ± 0%    17.4kB ± 0%       ~     (all equal)
DecodeEntriesWithoutSkipping/single-short-line-8                   512B ± 0%      512B ± 0%       ~     (all equal)
DecodeEntriesWithoutSkipping/single-short-line-with-escapes-8      512B ± 0%      512B ± 0%       ~     (all equal)
DecodeEntriesWithoutSkipping/many-short-lines-8                    512B ± 0%      512B ± 0%       ~     (all equal)
DecodeEntriesWithoutSkipping/field-key-escape-not-escapable-8      512B ± 0%      514B ± 0%     +0.39%  (p=0.008 n=5+5)
DecodeEntriesWithoutSkipping/tag-value-triple-escape-space-8       512B ± 0%      514B ± 0%     +0.39%  (p=0.008 n=5+5)
DecodeEntriesWithoutSkipping/procstat-8                            512B ± 0%      784B ± 0%    +53.12%  (p=0.008 n=5+5)

name                                                           old allocs/op  new allocs/op  delta
DecodeEntriesSkipping/long-lines-8                                 1.00 ± 0%      1.00 ± 0%       ~     (all equal)
DecodeEntriesSkipping/long-lines-with-escapes-8                    1.00 ± 0%      1.00 ± 0%       ~     (all equal)
DecodeEntriesSkipping/single-short-line-8                          1.00 ± 0%      1.00 ± 0%       ~     (all equal)
DecodeEntriesSkipping/single-short-line-with-escapes-8             1.00 ± 0%      1.00 ± 0%       ~     (all equal)
DecodeEntriesSkipping/many-short-lines-8                           1.00 ± 0%      1.00 ± 0%       ~     (all equal)
DecodeEntriesSkipping/field-key-escape-not-escapable-8             1.00 ± 0%      1.00 ± 0%       ~     (all equal)
DecodeEntriesSkipping/tag-value-triple-escape-space-8              1.00 ± 0%      1.00 ± 0%       ~     (all equal)
DecodeEntriesSkipping/procstat-8                                   1.00 ± 0%      1.00 ± 0%       ~     (all equal)
DecodeEntriesWithoutSkipping/long-lines-8                          1.00 ± 0%      1.00 ± 0%       ~     (all equal)
DecodeEntriesWithoutSkipping/long-lines-with-escapes-8             7.00 ± 0%      7.00 ± 0%       ~     (all equal)
DecodeEntriesWithoutSkipping/single-short-line-8                   1.00 ± 0%      1.00 ± 0%       ~     (all equal)
DecodeEntriesWithoutSkipping/single-short-line-with-escapes-8      1.00 ± 0%      1.00 ± 0%       ~     (all equal)
DecodeEntriesWithoutSkipping/many-short-lines-8                    1.00 ± 0%      1.00 ± 0%       ~     (all equal)
DecodeEntriesWithoutSkipping/field-key-escape-not-escapable-8      1.00 ± 0%      2.00 ± 0%   +100.00%  (p=0.008 n=5+5)
DecodeEntriesWithoutSkipping/tag-value-triple-escape-space-8       1.00 ± 0%      2.00 ± 0%   +100.00%  (p=0.008 n=5+5)
DecodeEntriesWithoutSkipping/procstat-8                            1.00 ± 0%     30.00 ± 0%  +2900.00%  (p=0.008 n=5+5)
rogpeppe

Merge pull request #51 from influxdata/rog-037-fix-read-bug

lineprotocol: fix Decoder bug when decoding from an io.Reader

commit sha: 8d23ab88a3da08ce86b05d8774a0e957f39944a2

pushed 4 days ago
pull request

rogpeppe pull request influxdata/line-protocol

rogpeppe

lineprotocol: fix Decoder bug when decoding from an io.Reader

open pull request

rogpeppe wants to merge heetch/avro

rogpeppe

Allow UUID empty string decoding as empty UUID object

As we do in UUID encoding https://github.com/heetch/avro/blob/master/encode.go#L278-L279.

When we encode a zero UUID, we send "" on the wire, so decoding should behave the same way and produce a zero value in this case too.

rogpeppe

How about just avoiding parsing entirely when the string is empty?

if frame.String == "" {
    // We produce the empty string when encoding the zero UUID value,
    // so allow it when decoding too.
    target.Set(gouuid.UUID{})
    break
}
val, err := gouuid.Parse(frame.String)
etc
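
The symmetry that suggestion aims for can be checked with a small round-trip sketch. It is self-contained and does not use heetch/avro or the real UUID package: the UUID type, encodeUUID and decodeUUID below are hypothetical stand-ins, and the parser is intentionally lax.

```go
package main

import (
	"encoding/hex"
	"errors"
	"fmt"
	"strings"
)

// UUID is a hypothetical stand-in for the real UUID type.
type UUID [16]byte

// encodeUUID writes the zero UUID as the empty string, mirroring the
// encoder behaviour described above.
func encodeUUID(u UUID) string {
	if u == (UUID{}) {
		return ""
	}
	h := hex.EncodeToString(u[:])
	return h[0:8] + "-" + h[8:12] + "-" + h[12:16] + "-" + h[16:20] + "-" + h[20:32]
}

// decodeUUID accepts the empty string as the zero UUID before doing
// any parsing, so that encode and decode stay symmetric.
func decodeUUID(s string) (UUID, error) {
	if s == "" {
		return UUID{}, nil
	}
	b, err := hex.DecodeString(strings.ReplaceAll(s, "-", ""))
	if err != nil {
		return UUID{}, err
	}
	if len(b) != 16 {
		return UUID{}, errors.New("invalid UUID length")
	}
	var u UUID
	copy(u[:], b)
	return u, nil
}

func main() {
	for _, u := range []UUID{{}, {0: 0xde, 15: 0x01}} {
		s := encodeUUID(u)
		got, err := decodeUUID(s)
		fmt.Printf("%q round-trips: %v (err=%v)\n", s, got == u, err)
	}
}
```

Handling the empty string before parsing keeps the special case in one obvious place and leaves the parser's error handling untouched for every other input.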
pull request

rogpeppe merge to heetch/avro

rogpeppe

Allow UUID empty string decoding as empty UUID object

rogpeppe

LGTM with one suggestion.

Nov 23 (5 days ago)
issue

rogpeppe issue comment influxdata/line-protocol

rogpeppe

lineprotocol: fix Decoder bug when decoding from an io.Reader

When we needed more input at the end of the buffer, we were sliding the existing data to the front of the buffer, ignoring the fact that existing tokens could still be pointing into that data, which the slide would then overwrite.

Avoid that possibility by sliding data only when we reset the decoder after each entry has been entirely decoded.

It's likely that this was the cause of https://github.com/influxdata/line-protocol/issues/50, although I haven't managed to recreate the panic. Specifically, in Decoder.NextField, this scenario could have happened:

  • we decode the field tag
  • we decode the field value, triggering a read request which corrupts the field tag by overwriting it.
  • there's an error decoding the value
  • we calculate the original encoded length of the tag by looking at its data, but the corrupted data now contains a character that needs escaping, which it didn't before, so the calculated length is longer than the original.
  • this results in us passing a startIndex value that's too large to d.syntaxErrorf
  • this results in the index out-of-bounds panic

The reason this issue wasn't caught by fuzzing was that we were only fuzzing with NewDecoderWithBytes, not with NewDecoder itself.

Performance seems largely unaffected:

name                                                           old time/op    new time/op    delta
DecodeEntriesSkipping/long-lines-8                               25.4ms ± 1%    25.4ms ± 1%       ~     (p=0.421 n=5+5)
DecodeEntriesSkipping/long-lines-with-escapes-8                  29.6ms ± 0%    29.4ms ± 0%     -0.60%  (p=0.008 n=5+5)
DecodeEntriesSkipping/single-short-line-8                         408ns ± 1%     407ns ± 1%       ~     (p=0.690 n=5+5)
DecodeEntriesSkipping/single-short-line-with-escapes-8            415ns ± 2%     418ns ± 2%       ~     (p=0.548 n=5+5)
DecodeEntriesSkipping/many-short-lines-8                          178ms ± 1%     175ms ± 1%     -1.49%  (p=0.008 n=5+5)
DecodeEntriesSkipping/field-key-escape-not-escapable-8            369ns ± 2%     367ns ± 0%       ~     (p=0.690 n=5+5)
DecodeEntriesSkipping/tag-value-triple-escape-space-8             447ns ± 2%     442ns ± 0%       ~     (p=0.151 n=5+5)
DecodeEntriesSkipping/procstat-8                                 3.43µs ± 2%    3.35µs ± 1%     -2.29%  (p=0.008 n=5+5)
DecodeEntriesWithoutSkipping/long-lines-8                        25.4ms ± 1%    25.2ms ± 0%     -0.68%  (p=0.016 n=5+5)
DecodeEntriesWithoutSkipping/long-lines-with-escapes-8            101ms ± 1%     101ms ± 0%       ~     (p=0.310 n=5+5)
DecodeEntriesWithoutSkipping/single-short-line-8                  442ns ± 2%     438ns ± 1%       ~     (p=0.310 n=5+5)
DecodeEntriesWithoutSkipping/single-short-line-with-escapes-8     467ns ± 2%     465ns ± 2%       ~     (p=1.000 n=5+5)
DecodeEntriesWithoutSkipping/many-short-lines-8                   205ms ± 2%     207ms ± 0%       ~     (p=0.222 n=5+5)
DecodeEntriesWithoutSkipping/field-key-escape-not-escapable-8     516ns ± 6%     420ns ± 2%    -18.60%  (p=0.008 n=5+5)
DecodeEntriesWithoutSkipping/tag-value-triple-escape-space-8      586ns ± 1%     510ns ± 1%    -13.05%  (p=0.008 n=5+5)
DecodeEntriesWithoutSkipping/procstat-8                          5.76µs ± 2%    6.23µs ± 0%     +8.11%  (p=0.016 n=5+4)

name                                                           old speed      new speed      delta
DecodeEntriesSkipping/long-lines-8                             1.03GB/s ± 1%  1.03GB/s ± 1%       ~     (p=0.421 n=5+5)
DecodeEntriesSkipping/long-lines-with-escapes-8                 886MB/s ± 0%   891MB/s ± 0%     +0.61%  (p=0.008 n=5+5)
DecodeEntriesSkipping/single-short-line-8                      71.0MB/s ± 1%  71.2MB/s ± 1%       ~     (p=0.690 n=5+5)
DecodeEntriesSkipping/single-short-line-with-escapes-8         77.2MB/s ± 2%  76.6MB/s ± 2%       ~     (p=0.548 n=5+5)
DecodeEntriesSkipping/many-short-lines-8                        147MB/s ± 1%   150MB/s ± 1%     +1.52%  (p=0.008 n=5+5)
DecodeEntriesSkipping/field-key-escape-not-escapable-8         89.4MB/s ± 2%  90.0MB/s ± 0%       ~     (p=0.690 n=5+5)
DecodeEntriesSkipping/tag-value-triple-escape-space-8           112MB/s ± 2%   113MB/s ± 0%       ~     (p=0.151 n=5+5)
DecodeEntriesSkipping/procstat-8                                387MB/s ± 1%   396MB/s ± 1%     +2.34%  (p=0.008 n=5+5)
DecodeEntriesWithoutSkipping/long-lines-8                      1.03GB/s ± 1%  1.04GB/s ± 0%     +0.68%  (p=0.016 n=5+5)
DecodeEntriesWithoutSkipping/long-lines-with-escapes-8          259MB/s ± 1%   261MB/s ± 0%       ~     (p=0.286 n=5+5)
DecodeEntriesWithoutSkipping/single-short-line-8               65.6MB/s ± 2%  66.2MB/s ± 1%       ~     (p=0.310 n=5+5)
DecodeEntriesWithoutSkipping/single-short-line-with-escapes-8  68.5MB/s ± 2%  68.9MB/s ± 2%       ~     (p=1.000 n=5+5)
DecodeEntriesWithoutSkipping/many-short-lines-8                 128MB/s ± 2%   126MB/s ± 0%       ~     (p=0.222 n=5+5)
DecodeEntriesWithoutSkipping/field-key-escape-not-escapable-8  64.1MB/s ± 6%  78.6MB/s ± 2%    +22.60%  (p=0.008 n=5+5)
DecodeEntriesWithoutSkipping/tag-value-triple-escape-space-8   85.3MB/s ± 1%  98.1MB/s ± 1%    +15.02%  (p=0.008 n=5+5)
DecodeEntriesWithoutSkipping/procstat-8                         230MB/s ± 2%   213MB/s ± 0%     -7.51%  (p=0.016 n=5+4)

name                                                           old alloc/op   new alloc/op   delta
DecodeEntriesSkipping/long-lines-8                                 512B ± 0%      512B ± 0%       ~     (all equal)
DecodeEntriesSkipping/long-lines-with-escapes-8                    512B ± 0%      512B ± 0%       ~     (all equal)
DecodeEntriesSkipping/single-short-line-8                          512B ± 0%      512B ± 0%       ~     (all equal)
DecodeEntriesSkipping/single-short-line-with-escapes-8             512B ± 0%      512B ± 0%       ~     (all equal)
DecodeEntriesSkipping/many-short-lines-8                           512B ± 0%      512B ± 0%       ~     (all equal)
DecodeEntriesSkipping/field-key-escape-not-escapable-8             512B ± 0%      512B ± 0%       ~     (all equal)
DecodeEntriesSkipping/tag-value-triple-escape-space-8              512B ± 0%      512B ± 0%       ~     (all equal)
DecodeEntriesSkipping/procstat-8                                   512B ± 0%      512B ± 0%       ~     (all equal)
DecodeEntriesWithoutSkipping/long-lines-8                          512B ± 0%      512B ± 0%       ~     (all equal)
DecodeEntriesWithoutSkipping/long-lines-with-escapes-8           17.4kB ± 0%    17.4kB ± 0%       ~     (all equal)
DecodeEntriesWithoutSkipping/single-short-line-8                   512B ± 0%      512B ± 0%       ~     (all equal)
DecodeEntriesWithoutSkipping/single-short-line-with-escapes-8      512B ± 0%      512B ± 0%       ~     (all equal)
DecodeEntriesWithoutSkipping/many-short-lines-8                    512B ± 0%      512B ± 0%       ~     (all equal)
DecodeEntriesWithoutSkipping/field-key-escape-not-escapable-8      512B ± 0%      514B ± 0%     +0.39%  (p=0.008 n=5+5)
DecodeEntriesWithoutSkipping/tag-value-triple-escape-space-8       512B ± 0%      514B ± 0%     +0.39%  (p=0.008 n=5+5)
DecodeEntriesWithoutSkipping/procstat-8                            512B ± 0%      784B ± 0%    +53.12%  (p=0.008 n=5+5)

name                                                           old allocs/op  new allocs/op  delta
DecodeEntriesSkipping/long-lines-8                                 1.00 ± 0%      1.00 ± 0%       ~     (all equal)
DecodeEntriesSkipping/long-lines-with-escapes-8                    1.00 ± 0%      1.00 ± 0%       ~     (all equal)
DecodeEntriesSkipping/single-short-line-8                          1.00 ± 0%      1.00 ± 0%       ~     (all equal)
DecodeEntriesSkipping/single-short-line-with-escapes-8             1.00 ± 0%      1.00 ± 0%       ~     (all equal)
DecodeEntriesSkipping/many-short-lines-8                           1.00 ± 0%      1.00 ± 0%       ~     (all equal)
DecodeEntriesSkipping/field-key-escape-not-escapable-8             1.00 ± 0%      1.00 ± 0%       ~     (all equal)
DecodeEntriesSkipping/tag-value-triple-escape-space-8              1.00 ± 0%      1.00 ± 0%       ~     (all equal)
DecodeEntriesSkipping/procstat-8                                   1.00 ± 0%      1.00 ± 0%       ~     (all equal)
DecodeEntriesWithoutSkipping/long-lines-8                          1.00 ± 0%      1.00 ± 0%       ~     (all equal)
DecodeEntriesWithoutSkipping/long-lines-with-escapes-8             7.00 ± 0%      7.00 ± 0%       ~     (all equal)
DecodeEntriesWithoutSkipping/single-short-line-8                   1.00 ± 0%      1.00 ± 0%       ~     (all equal)
DecodeEntriesWithoutSkipping/single-short-line-with-escapes-8      1.00 ± 0%      1.00 ± 0%       ~     (all equal)
DecodeEntriesWithoutSkipping/many-short-lines-8                    1.00 ± 0%      1.00 ± 0%       ~     (all equal)
DecodeEntriesWithoutSkipping/field-key-escape-not-escapable-8      1.00 ± 0%      2.00 ± 0%   +100.00%  (p=0.008 n=5+5)
DecodeEntriesWithoutSkipping/tag-value-triple-escape-space-8       1.00 ± 0%      2.00 ± 0%   +100.00%  (p=0.008 n=5+5)
DecodeEntriesWithoutSkipping/procstat-8                            1.00 ± 0%     30.00 ± 0%  +2900.00%  (p=0.008 n=5+5)
rogpeppe

@philjb

nice find! how did you locate it in the end?

One of the end-to-end Flux tests was failing. I isolated the data that caused the problem by running every write through both decoders and checking when the results differed. Then I realised that the issue didn't happen when using NewDecoderWithBytes, which led swiftly to the problem area.

pull request

rogpeppe pull request influxdata/line-protocol

rogpeppe

lineprotocol: fix Decoder bug when decoding from an io.Reader

When we needed more input at the end of the buffer, we were sliding the existing data to the front of the buffer, but ignoring the fact that existing tokens could be pointing to that data, which the slide could then overwrite.

Avoid that possibility by sliding data only when we reset the decoder after each entry has been entirely decoded.

It's likely that this was the cause of https://github.com/influxdata/line-protocol/issues/50, although I haven't managed to recreate the panic. Specifically, in Decoder.NextField, this scenario could have happened:

  • we decode the field tag
  • we decode the field value, triggering a read request which corrupts the field tag by overwriting it.
  • there's an error decoding the value
  • we calculate the original encoded length of the tag by looking at its data, but the corrupted data now contains a character that needs escaping, which it didn't before, so the calculated length is longer than the original.
  • this results in us passing a startIndex value that's too large to d.syntaxErrorf
  • this results in the index out-of-bounds panic

The reason this issue wasn't caught by fuzzing was that we were only fuzzing with NewDecoderWithBytes, not with NewDecoder itself.

push

rogpeppe push influxdata/line-protocol

rogpeppe

lineprotocol: fix Decoder bug when decoding from an io.Reader

When we needed more input at the end of the buffer, we were sliding the existing data to the front of the buffer, but ignoring the fact that existing tokens could be pointing to that data, which the slide could then overwrite.

Avoid that possibility by sliding data only when we reset the decoder after each entry has been entirely decoded.

It's likely that this was the cause of https://github.com/influxdata/line-protocol/issues/50, although I haven't managed to recreate the panic. Specifically, in Decoder.NextField, this scenario could have happened:

  • we decode the field tag
  • we decode the field value, triggering a read request which corrupts the field tag by overwriting it.
  • there's an error decoding the value
  • we calculate the original encoded length of the tag by looking at its data, but the corrupted data now contains a character that needs escaping, which it didn't before, so the calculated length is longer than the original.
  • this results in us passing a startIndex value that's too large to d.syntaxErrorf
  • this results in the index out-of-bounds panic

The reason this issue wasn't caught by fuzzing was that we were only fuzzing with NewDecoderWithBytes, not with NewDecoder itself.


commit sha: 6e963d9112f1882630fbfae02ce6de0c539948a3

push time in 5 days ago
push

rogpeppe push influxdata/line-protocol

rogpeppe

lineprotocol: fix Decoder bug when decoding from an io.Reader

When we needed more input at the end of the buffer, we were sliding the existing data to the front of the buffer, but ignoring the fact that existing tokens could be pointing to that data, which the slide could then overwrite.

Avoid that possibility by sliding data only when we reset the decoder after each entry has been entirely decoded.

It's likely that this was the cause of https://github.com/influxdata/line-protocol/issues/50, although I haven't managed to recreate the panic. Specifically, in Decoder.NextField, this scenario could have happened:

  • we decode the field tag
  • we decode the field value, triggering a read request which corrupts the field tag by overwriting it.
  • there's an error decoding the value
  • we calculate the original encoded length of the tag by looking at its data, but the corrupted data now contains a character that needs escaping, which it didn't before, so the calculated length is longer than the original.
  • this results in us passing a startIndex value that's too large to d.syntaxErrorf
  • this results in the index out-of-bounds panic

The reason this issue wasn't caught by fuzzing was that we were only fuzzing with NewDecoderWithBytes, not with NewDecoder itself.

commit sha: e16f114fdc553051a20b4615c6759ca8ebcd7834

push time in 5 days ago
created branch

rogpeppe in influxdata/line-protocol create branch rog-037-fix-read-bug

createdAt 5 days ago
Nov
22
6 days ago
push

rogpeppe push influxdata/openapi

rogpeppe
rogpeppe

remove extra newlines from import blocks

rogpeppe

update generate script and regenerate all and validate too

commit sha: 14277ab8513b99a1f1145b5d3d5fce64b8ff50f3

push time in 6 days ago
issue

rogpeppe issue influxdata/openapi

rogpeppe

schemas fail swagger-cli validation

commit 12c8165b70cb379486592ffbf43ac5a3dde04904

By way of sanity checking, I thought I'd run the schemas through the swagger-cli validator, and they don't seem to pass validation:

% swagger-cli validate oss.yml
Swagger schema validation failed. 
  Data does not match any schemas from 'oneOf' at #/components/securitySchemes/TokenAuthentication
    Missing required property: $ref at #/components/securitySchemes/TokenAuthentication
    Data does not match any schemas from 'oneOf' at #/components/securitySchemes/TokenAuthentication
      Missing required property: in at #/
      Data does not match any schemas from 'oneOf' at #/
        No enum match for: token at #/scheme
        Data matches schema from 'not' at #/
      Missing required property: flows at #/
      Missing required property: openIdConnectUrl at #/
 
JSON_OBJECT_VALIDATION_FAILED

It seems that the security schemes don't match the OpenAPI 3.1 specification (for example, the TokenAuthentication scheme should probably be defined as scheme: bearer, as a token scheme doesn't seem to exist).

issue

rogpeppe issue cue-lang/cue

rogpeppe

cmd/cue: multiline yaml string causes odd import formatting

commit 9217c4d0b6f86ba942e66ea6049e1517e43a7fdf

The example below formats strangely when using cue import. The field names and values seem to get out of sync, e.g.

		b:
			"y", c:
			"x"

This behaviour doesn't happen when using cue export on the same file. Changing the contents of the multiline string (e.g. removing the trailing hyphen on the first line) changes the behaviour.

I saw this behaviour happen a lot in a real world import example.

exec cue export --out cue x.yml -o x-export.cue
exec cue import x.yml
cmp x.cue x-export.cue

-- x.yml --
x:
  d: |
   xxxxxxxxxxxx –
   xxxxxxxxx – xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
   xxxxxxxxxxxx – xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
   xxxxxxxxxx – xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

  p:
    - b: y
      c: x

Maybe related to #826.

push

rogpeppe push influxdata/openapi

rogpeppe

moving towards bringing in the other services too

commit sha: 72c533ca76bb1d0b4ace5ad5b16de0db126dd713

push time in 6 days ago
issue

rogpeppe issue cue-lang/cue

rogpeppe

cmd/cue: misleading error report

commit 9217c4d0b6f86ba942e66ea6049e1517e43a7fdf

I've left some of the original structure here to give a flavour of the kind of thing I was dealing with. The example this was derived from consisted of over 60k lines of CUE.

I spent a couple of hours trying to work out what the conflict was here. The problem is that, for some reason, the first error doesn't seem like it should be a conflict at all: it says that the properties field isn't allowed inside cloud.components.schemas.OnboardingResponse.properties.bucket, but it seems clear that that path does allow a properties field (by virtue of allowing commonschemas.Bucket, which has a properties field).

The actual error is highlighted by the second message, but that breaks the usual flow of fixing the first error that we see, then moving on. In the original, there were many errors and the retentionRules one was 5th in the list.

exec cue vet -c

-- openapi.cue --
package contracts

new: cloud: components: schemas: OnboardingResponse: cloudschemas.OnboardingResponse

cloudschemas: OnboardingResponse: properties: bucket: *commonschemas.Bucket.#Ref | commonschemas.Bucket

old: cloud: components: schemas: OnboardingResponse: properties: bucket: properties: retentionRules: {
	items: type: "object"
}

commonschemas: Bucket: {
	#Ref: $ref:                 "x"
	properties: retentionRules: commonschemas.RetentionRules
}

commonschemas: RetentionRules: {
	#Ref: $ref:  "y"
	items: $ref: "foo"
}

#New: new
#Old: old

#Both: #Old & #New

I see this output:

#Both.cloud.components.schemas.OnboardingResponse.properties.bucket: 2 errors in empty disjunction:
#Both.cloud.components.schemas.OnboardingResponse.properties.bucket: field not allowed: properties:
    ./openapi.cue:3:54
    ./openapi.cue:5:56
    ./openapi.cue:7:74
    ./openapi.cue:12:8
    ./openapi.cue:21:7
    ./openapi.cue:22:7
    ./openapi.cue:24:8
    ./openapi.cue:24:15
#Both.cloud.components.schemas.OnboardingResponse.properties.bucket.properties.retentionRules.items: field not allowed: type:
    ./openapi.cue:3:54
    ./openapi.cue:5:84
    ./openapi.cue:8:9
    ./openapi.cue:13:30
    ./openapi.cue:18:9
    ./openapi.cue:21:7
    ./openapi.cue:22:7
    ./openapi.cue:24:8
    ./openapi.cue:24:15
Nov
19
1 week ago
issue

rogpeppe issue comment golang/go

rogpeppe

proposal: spec: add sum types / discriminated unions

This is a proposal for sum types, also known as discriminated unions. Sum types in Go should essentially act like interfaces, except that:

  • they are value types, like structs
  • the types contained in them are fixed at compile-time

Sum types can be matched with a switch statement. The compiler checks that all variants are matched. Inside the arms of the switch statement, the value can be used as if it is of the variant that was matched.

rogpeppe

If it's not using the fully-qualified name, what is it encoding instead?

It could use a name that it finds by calling a method on the type (you'd need to make that method work on the zero value of each type, but that might not be too much to ask).

issue

rogpeppe issue comment golang/go

rogpeppe

proposal: spec: add sum types / discriminated unions

rogpeppe

Isn't #36345 a problem specifically for gob because it uses Go type names on the wire? I don't think that's a great idea in general, and it doesn't seem necessary to me.

FWIW I'd expect the API for enumerating the members to report the fully qualified type without aliases, just like all the other reflect APIs.

created branch

rogpeppe in influxdata/openapi create branch use-cue

createdAt 1 week ago
issue

rogpeppe issue 9fans/go

rogpeppe

cmd/acme: ^D doesn't work in win

To reproduce, in a win window do:

% cat > /tmp/foo
something
^D

The cat never receives EOF and the ^D character goes into the file.