Update to Bison 3.8.1

open
GitMensch
Posted 10 months ago

Update to Bison 3.8.1 #74

Bison 3.7.5 was released:

Noteworthy changes in release 3.7.5 (2021-01-24) [stable]

** Bug fixes

*** Counterexample Generation

In some cases counterexample generation could crash. This is fixed.

*** Fix Table Generation

In some very rare conditions, when there are many useless tokens, it was possible to generate incorrect parsers.

*** GLR parsers now support %merge together with api.value.type=union.

*** C++ parsers use noexcept in more places.

*** Generated parsers avoid some warnings about signedness issues.

*** C-language parsers now avoid warnings from pedantic clang.

*** C-language parsers now work around quirks of HP-UX 11.23 (2003).

** Changes

*** C++ value_type

Prefer value_type to semantic_type to denote the semantic value type, specified by the api.value.type %define variable.

*** GLR traces

There were no debug traces for deferred calls to user actions. They are logged now.

** New features

*** Option -H, --header and directive %header

The option -H/--header supersedes the option --defines, and the directive %header supersedes %defines. Both --defines and %defines are, of course, maintained for backward compatibility.

*** Option --html

Since version 2.4, Bison can be used to generate HTML reports. However, it was a two-step process: first bison must be invoked with option --xml, and then xsltproc must be run to convert the XML reports into HTML.

The new option --html combines these steps. The xsltproc program must be available.

*** A C++ native GLR parser

A new version of the generated C++ GLR parser was added as "glr2.cc". It is forked from the existing glr.c/cc parser, with the objective of making it a more modern, truly C++ parser (instead of a C++ wrapper around a C parser). Down the line, the goal is to support %define api.value.type variant and maybe share code with lalr1.cc.

The current parser should be identical in terms of interface, functionality and performance to "glr.cc". To try it out, simply use

%skeleton "glr2.cc"

*** Counterexamples

Counterexamples now show the rule numbers, and always show ε for rules with an empty right-hand side. For instance

exp
↳ 1: e1       e2     "a"
     ↳ 3: ε • ↳ 1: ε

instead of

exp
↳ e1  e2  "a"
  ↳ • ↳ ε

*** Lookahead correction in Java

The Java skeleton (lalr1.java) now supports LAC, via the parse.lac %define variable.

*** Abort parsing for memory exhaustion (C)

The user actions may now use YYNOMEM to abort the current parse with memory exhaustion.

lexxmark
Created 10 months ago

Thank you for the information. Let's return to this issue in the summer.

I prefer to adopt upstream changes once a year.

GitMensch
Created 5 months ago

Noteworthy changes in release 3.7.6 (2021-03-08) [stable]

Bug fixes

Reused Push Parsers

When a push-parser state structure is used for multiple parses, it was possible for some state to leak from one run into the following one.

Fix Table Generation

In some very rare conditions, when there are many useless tokens, it was possible to generate incorrect parsers.

Note: the list of unreleased changes is already quite long, so I guess a new version is on its way. Over the last year there has been an update every 2-4 months, so that may be the case here, too. The nice counterexamples were improved again as well.

lexxmark
Created 5 months ago

@GitMensch thank you for the reminder, it is definitely time to upgrade win_bison.

lexxmark
Created 5 months ago

Since a flex release is becoming real now, I suggest waiting a little longer (say, until August) and upgrading both bison and flex together. Another reason: I will be on vacation in July and don't want to make large changes before leaving.

GitMensch
Created 2 months ago

Bison 3.8.1 released

I'm very pleased to announce the release of Bison 3.8(.1), whose main novelty is the D backend for deterministic parsers, contributed by Adela Vais. It supports all the bells and whistles of Bison's other deterministic parsers, which include: pull/push interfaces, verbose and custom error messages, lookahead correction, LALR(1), IELR(1), canonical LR(1), token constructors, internationalization, locations, printers, token and symbol prefixes, and more.

There are several other notable changes. Please see the detailed NEWS below for more details.

Noteworthy changes in release 3.8.1 (2021-09-11) [stable]

The generation of prototypes for yylex and yyerror in Yacc mode is breaking existing grammar files. To avoid breaking too many grammars, the prototypes are now generated when -y/--yacc is used and the POSIXLY_CORRECT environment variable is defined.

Avoid using -y/--yacc simply to comply with Yacc's file name conventions; rather, use -o y.tab.c. Autoconf's AC_PROG_YACC macro uses -y. Avoid it if possible, for instance by using gnulib's gl_PROG_BISON.

Noteworthy changes in release 3.8 (2021-09-07) [stable]

Backward incompatible changes

In conformance with the recommendations of the Graphviz team (https://marc.info/?l=graphviz-devel&m=129418103126092), -g/--graph now generates a *.gv file by default, instead of *.dot. A transition started in Bison 3.4.

To comply with the latest POSIX standard, in Yacc compatibility mode (options -y/--yacc) Bison now generates prototypes for yyerror and yylex. In some situations, this is breaking compatibility: if the user has already declared these functions but with some differences (e.g., to declare them as static, or to use specific attributes), the generated parser will fail to compile. To disable these prototypes, #define yyerror (to yyerror), and likewise for yylex.

Deprecated features

Support for the YYPRINT macro is removed. It worked only with yacc.c and only for tokens. It was obsoleted by %printer, introduced in Bison 1.50 (November 2002).

It has always been recommended to prefer %define api.value.type foo to #define YYSTYPE foo. The latter is supported in C for compatibility with Yacc, but not in C++. Warnings are now issued if #define YYSTYPE is used in C++, and eventually support will be removed.

In C++ code, prefer value_type to semantic_type to denote the semantic value type, which is specified by the api.value.type %define variable.

New features

A skeleton for the D programming language

The "lalr1.d" skeleton is now officially part of Bison.

It was originally contributed by Oliver Mangold, based on Paolo Bonzini's lalr1.java, and was improved by H. S. Teoh. Adela Vais then took over maintenance and invested a lot of effort to complete, test and document it.

It now supports all the bells and whistles of the other deterministic parsers, which include: pull/push interfaces, verbose and custom error messages, lookahead correction, token constructors, internationalization, locations, printers, token and symbol prefixes, etc.

Two examples demonstrate the D parsers: a basic one (examples/d/simple), and an advanced one (examples/d/calc).

Option -H, --header and directive %header

The option -H/--header supersedes the option --defines, and the directive %header supersedes %defines. Both --defines and %defines are, of course, maintained for backward compatibility.

Option --html

Since version 2.4, Bison can be used to generate HTML reports. However, it was a two-step process: first bison must be invoked with option --xml, and then xsltproc must be run to convert the XML reports into HTML.

The new option --html combines these steps. The xsltproc program must be available.

A C++ native GLR parser

A new version of the C++ GLR parser was added: "glr2.cc". It generates "true C++11", instead of a C++ wrapper around a C parser as does the existing "glr.cc" parser. As a first significant consequence, it supports %define api.value.type variant, contrary to glr.cc.

It should be upward compatible in terms of interface, feature and performance to "glr.cc". To try it out, simply use

%skeleton "glr2.cc"

It will eventually replace "glr.cc". However we need user feedback on this skeleton. Please report your results and comments about it.

Counterexamples

Counterexamples now show the rule numbers, and always show ε for rules with an empty right-hand side. For instance

exp
↳ 1: e1       e2     "a"
     ↳ 3: ε • ↳ 1: ε

instead of

exp
↳ e1  e2  "a"
  ↳ • ↳ ε

Lookahead correction in Java

The Java skeleton (lalr1.java) now supports LAC, via the parse.lac %define variable.

Abort parsing for memory exhaustion (C)

User actions may now use YYNOMEM (similar to YYACCEPT and YYABORT) to abort the current parse with memory exhaustion.

Printing locations in debug traces (C)

The YYLOCATION_PRINT(File, Loc) macro prints a location. It is defined when (i) locations are enabled, (ii) the default type for locations is used, (iii) debug traces are enabled, and (iv) YYLOCATION_PRINT is not already defined.

Users may define YYLOCATION_PRINT to cover other cases.

GLR traces

There were no debug traces for deferred calls to user actions. They are logged now.

While there have been more changes in flex since the last post, there is still no new release. I suggest pulling the trigger on bison and doing a release now, then another update in N months when/if there is a flex release.

madebr
Created 2 months ago

Additionally, earlier this year, m4 received an update to 1.4.19.

NEWS

  • Noteworthy changes in release 1.4.19 (2021-05-28) [stable]

** A number of portability improvements inherited from gnulib, including the ability to perform stack overflow detection on more platforms without linking to GNU libsigsegv.

  • Noteworthy changes in release 1.4.18d (2021-05-11) [beta]

** A number of portability improvements inherited from gnulib.

  • Noteworthy changes in release 1.4.18b (2021-05-07) [beta]

** The symbol hash table now defaults to 65537 buckets instead of 509, as modern systems have enough memory to benefit from fewer hash collisions by default.

** Introduce the use of gettext, with the immediate benefit of nicer UTF-8 author names. Over time, more translations of program messages will become available.

** A number of portability improvements inherited from gnulib.

lexxmark
Created 2 months ago

Additionally, earlier this year, m4 received an update to 1.4.19.

I tried to adopt the new M4 code but failed.

It seems to support WIN32 compilation but, for example, uses the sigset_t type in the asyncsafe-spin.h file.

@madebr or @GitMensch Could you configure the M4 code under Windows, so that I can try to adopt the code generated for Windows?

madebr
Created 2 months ago

I've created an m4/1.4.19 recipe for conan. This uses the autotools build system. Perhaps you can adopt its generated sources/headers?

When enabling debug runtime (MTd/MDd), it runs fine when built with MSVC2017. When building with MSVC2019, an assertion error is always thrown.

lexxmark
Created 2 months ago

I've created a m4/1.4.19 recipe for conan. This uses the autotools build system. Perhaps you can adopt its generated sources/headers?

@madebr Could you please give me a direct link to the sources/headers?

madebr
Created 2 months ago

I've attached the source, build and package folder of a conan build of m4/1.4.19.
The source folder contains the unmodified m4 sources + conan patches.
The build folder contains the patched m4 sources + build artifacts (.obj/.exe + gnulib generated headers).
The package folder contains only the .exe + license.

m4_1.4.19.zip

You can reproduce it by running the following commands in cmd.exe:

python -m pip install --user conan
conan profile new --detect
conan profile show default  # Should contain "compiler=Visual Studio" and "compiler.version=16"
conan install m4/1.4.19@ --build m4

The default conan cache is located at %HOME%\.conan. m4 can be found at %HOME%\.conan\data\m4\1.4.19\_\_.
