@pboling
Contributor

@pboling pboling commented Jan 4, 2026

This PR updates ruby_tree_sitter to support the tree-sitter 0.26.x C library, enabling
grammars with LANGUAGE_VERSION 15 to be loaded and used.

Changes

ext/tree_sitter/language.c

  • Replace ts_language_version() with ts_language_abi_version() (the new API function)
  • LANGUAGE_VERSION constant updated from 14 to 15 (defined by tree-sitter)
  • MIN_COMPATIBLE_LANGUAGE_VERSION updated to 13 (was 6) - this value is defined by tree-sitter,
    and it means grammars built with very old tree-sitter CLI versions are no longer supported

ext/tree_sitter/encoding.c

  • Replace TSInputEncodingUTF16 with TSInputEncodingUTF16LE (the new default for
    little-endian systems, which is the common case)
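
To make the LE/BE distinction concrete, here is a minimal Ruby sketch of how the same text lays out in each byte order. The helper names `utf16_bytes` and `little_endian_host?` are hypothetical and are not part of ruby_tree_sitter; the sketch only uses Ruby's built-in `String#encode` and `Array#pack`.

```ruby
# Hypothetical helper (not part of ruby_tree_sitter): encodes `text` as UTF-16
# in the requested byte order and returns the raw bytes.
def utf16_bytes(text, byte_order)
  encoding = byte_order == :big ? Encoding::UTF_16BE : Encoding::UTF_16LE
  text.encode(encoding).bytes
end

# Hypothetical helper: true when the host's native 16-bit layout is
# little-endian ("S" packs in native order, "v" always packs little-endian).
def little_endian_host?
  [1].pack("S") == [1].pack("v")
end

p utf16_bytes("ab", :little) # => [97, 0, 98, 0]
p utf16_bytes("ab", :big)    # => [0, 97, 0, 98]
p little_endian_host?
```

Input that is actually big-endian would presumably need the UTF16BE variant rather than the default chosen here.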

ext/tree_sitter/parser.c

  • Remove parser_get_timeout_micros() and parser_set_timeout_micros() methods
    (the underlying C functions were removed in tree-sitter 0.26)
  • Remove parser_get_cancellation_flag() and parser_set_cancellation_flag() methods
    (the underlying C functions were removed in tree-sitter 0.26)

ext/tree_sitter/extconf.rb

  • Update TREE_SITTER_VERSION constant to "0.26.3"
  • The C library source is downloaded and compiled during gem installation

Breaking Changes

  • Ruby API: The following methods are removed from TreeSitter::Parser:

    • timeout_micros / timeout_micros=
    • cancellation_flag / cancellation_flag=

    These were deprecated in tree-sitter 0.25 and removed in 0.26. Users should use
    alternative cancellation mechanisms (e.g., Ruby's Timeout module).
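
A minimal sketch of the Timeout-based alternative, assuming a wrapper around the parse call is acceptable. `with_deadline` is a hypothetical helper, not part of ruby_tree_sitter, and `sleep` stands in for a slow `parser.parse_string` call. One caveat: Timeout works by raising in the running thread, so it may not interrupt a parse that is executing inside the C extension until control returns to the Ruby interpreter.

```ruby
require "timeout"

# Hypothetical helper (not part of ruby_tree_sitter): yields to the block and
# returns its result, or nil when the block runs longer than `seconds`.
def with_deadline(seconds)
  Timeout.timeout(seconds) { yield }
rescue Timeout::Error
  nil
end

# `sleep` is a stand-in for a slow parser.parse_string(nil, source) call.
p(with_deadline(1) { :parsed })              # => :parsed
p(with_deadline(0.05) { sleep(1); :parsed }) # => nil
```

Callers would then treat a `nil` result as "parse abandoned" and decide whether to retry with a larger budget.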

Backward Compatibility

  • Grammars with LANGUAGE_VERSION 13-15 are supported (tree-sitter 0.26.3 constraint)
  • Grammars with LANGUAGE_VERSION < 13 are rejected by tree-sitter 0.26.x
  • If you need to use older grammars, they must be regenerated with a recent tree-sitter CLI
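
The supported window can be sketched as a simple range check. The bounds below are copied from this description; `SUPPORTED_ABI_VERSIONS` and `loadable_abi?` are hypothetical names, and a real program would read `TreeSitter::MIN_COMPATIBLE_LANGUAGE_VERSION` and `TreeSitter::LANGUAGE_VERSION` instead of hard-coding them.

```ruby
# Bounds copied from this PR description (assumption: they match the values
# exposed by the installed gem).
SUPPORTED_ABI_VERSIONS = (13..15).freeze

# Hypothetical helper: true when a grammar with this ABI version falls inside
# the supported window.
def loadable_abi?(abi_version)
  SUPPORTED_ABI_VERSIONS.cover?(abi_version)
end

[12, 13, 15, 16].each do |version|
  status = loadable_abi?(version) ? "supported" : "rejected"
  puts "ABI #{version}: #{status}"
end
```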

Testing

# Verify LANGUAGE_VERSION support
require 'tree_sitter'
puts TreeSitter::LANGUAGE_VERSION        # => 15
puts TreeSitter::MIN_COMPATIBLE_LANGUAGE_VERSION  # => 13

# Load a LANGUAGE_VERSION 15 grammar
lang = TreeSitter::Language.load("tree_sitter_toml", "/path/to/libtree-sitter-toml.so")
puts lang.abi_version  # => 15

# Parse content
parser = TreeSitter::Parser.new
parser.language = lang
tree = parser.parse_string(nil, 'key = "value"')
puts tree.root_node.type  # => "document"

Copilot AI review requested due to automatic review settings January 4, 2026 08:41

Copilot AI left a comment

Pull request overview

This PR updates the tree-sitter C library from v0.24.7 to v0.26.3, adapting to significant API changes in the 0.26.x release while maintaining backward compatibility where possible.

Key changes:

  • Migrated from ts_language_version to ts_language_abi_version API
  • Handled the split of TSInputEncodingUTF16 into UTF16LE and UTF16BE variants (defaulting to UTF16LE for backward compatibility)
  • Adapted to the removal of cancellation flag and timeout APIs by making them no-ops while preserving their Ruby interfaces

Reviewed changes

Copilot reviewed 5 out of 6 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| lib/tree_sitter/version.rb | Updates TREESITTER_VERSION from 0.24.7 to 0.26.3 |
| ext/tree_sitter/parser.c | Makes cancellation_flag and timeout_micros methods no-ops for backward compatibility after API removal |
| ext/tree_sitter/language.c | Migrates to ts_language_abi_version, improves dlopen/dlsym error handling, adds const correctness to typedef, and adds abi_version alias method |
| ext/tree_sitter/encoding.c | Maps UTF16 encoding to UTF16LE for backward compatibility after encoding enum split |
| News.md | Documents API changes and breaking changes for tree-sitter 0.26.3 compatibility |
| .gitignore | Adds build artifacts patterns for compiled extensions |

@pboling
Contributor Author

pboling commented Jan 5, 2026

I'll look into these CI failures today.

@stackmystack
Collaborator

Thanks for the submission!

I'll try to take a look at the failures too in the upcoming days … I'm a little busy these days, sorry.

If you managed to fix the CI before I intervene it might go faster :)

@stackmystack
Collaborator

Does anyone know how I can turn off the automatic slop review? No wonder the Zig people left GitHub; none of GitHub's support instructions helped.

@pboling
Contributor Author

pboling commented Jan 6, 2026

I think it must be in project settings somewhere...

@stackmystack
Collaborator

stackmystack commented Jan 7, 2026

I think it must be in project settings somewhere...

I see you didn't change parsers.toml: it's used in tests

You will need to bump them to the latest versions that work with the latest tree-sitter.

It's used by tsdl to download and build those parsers (there's a major rework I am doing now on tsdl coming very soon that adds caching and whatnot but the current release version should work normally by changing the requested tree-sitter version in parsers.toml).

@pboling
Contributor Author

pboling commented Jan 7, 2026

@stackmystack As an aside on tsdl - it is very cool!

I built a similar tool as a GitHub Action, because I am doing a lot with different parsers: https://github.com/kettle-rb/ts-grammar-action

If what I built there could leverage what you have in tsdl, I'd love to learn how I could build it on top!

I'll be using it to install tree-sitter and many grammars in my ast-merge suite of ruby gems - which are for providing a normalized merging DSL across all languages and grammars.

@stackmystack
Collaborator

stackmystack commented Jan 8, 2026

@stackmystack As an aside on tsdl - it is very cool!

Thanks!

I built a similar tool as a GitHub Action, because I am doing a lot with different parsers: https://github.com/kettle-rb/ts-grammar-action

If what I built there could leverage what you have in tsdl, I'd love to learn how I could build it on top!

I skimmed through the action, I see that there's a lot of shell scripting to download and compile those parsers. I made tsdl because I literally had almost the same script used in different settings: local dev, CI, and then for packaging and installing as done here.

I am sure you can leverage tsdl, especially since it lets you build quirky parsers like php (which has 2 parsers, php/php and php/php_only) and typescript (which, if I'm not mistaken, has to be built with make because of how the grammars are set up) … which complicates the bash script.

I'll be using it to install tree-sitter and many grammars in my ast-merge suite of ruby gems - which are for providing a normalized merging DSL across all languages and grammars.

That's awesome! tsdl works great, but it was a learning experience for me, wanting to learn async rust (even if it's an overkill) and trying many different ideas, no matter how unnecessarily complicated they are. But in 1.5.0, as it stands, I still have an issue with caching, which makes it really annoying when testing or when you have scripts that maintain your env … it will rebuild regardless of what happens.

In the 2.0.0 version, which I'm planning on releasing asap, I added caching and build-dir locking, and I also re-wrote the internals so that it runs concurrent builds with a single-threaded async runtime instead of the needless multithreaded one. Oh and the installed parsers are now hard-links instead of copies which also saves some KBs on disks :)

But I don't understand the use case for ast-merge. I see plenty of different parsers, but some of them are pure-Ruby parsers (like prism) … so I'm confused as to where it is useful …

I also saw another project referenced, appraisal2 … Is it doing what I think it does? It actually activates gemfiles according to the env? If so, then I think I might be using it. I co-maintain arel-extensions, and we have plenty of gemfiles we need to test under different conditions. I also spent quite some time unifying the CI and the dev env, but I still have issues with bundler when trying to install and work with gemfiles under different rubies installed in the same docker image; sometimes it just loses its mind as to which gems are activated … Does appraisal2 address such concerns?

@pboling
Contributor Author

pboling commented Jan 8, 2026

@stackmystack

It actually activates gemfiles according to the env?

Yes, appraisal2 basically manages the bundler env, allowing for a gemfile-per-scenario approach. I'll leave an example at bottom.

I enhanced the original, almost-dead, project from Thoughtbot, adding support for more of bundler's sugar, like eval_gemfile, and with that I am able to build a suite of "composable" or "modular" gemfiles that I then use in every project.

And that is actually what ast-merge is for! I use a library kettle-dev which is a gem templater, allowing me to share a large amount of boilerplate across all my gems, but also allow it to be customized per gem, and allow merge updates over time. It is these repeated merge updates that the ast-merge suite is for. It allows me to build document merging recipes (as YAML) that dictate how to merge specific templates, or partial templates, into destination documents (e.g. a Gem's README.md, or per-scenario-modular-gemfiles).

Appraisal2 Example

I have my CI workflows separated because some gems for some use cases will not install on all platforms, and I don't need them to. I only need to run style checks on one platform, for example.

.github/workflows/style.yaml

name: Style

permissions:
  contents: read

on:
  push:
    branches:
      - 'main'
      - '*-stable'
    tags:
      - '!*' # Do not execute on tags
  pull_request:
    branches:
      - '*'
  # Allow manually triggering the workflow.
  workflow_dispatch:

# Cancels all previous workflow runs for the same branch that have not yet completed.
concurrency:
  # The concurrency group contains the workflow name and the branch name.
  group: "${{ github.workflow }}-${{ github.ref }}"
  cancel-in-progress: true

jobs:
  rubocop:
    if: "!contains(github.event.commits[0].message, '[ci skip]') && !contains(github.event.commits[0].message, '[skip ci]')"
    name: Style on ${{ matrix.ruby }}@current
    runs-on: ubuntu-latest
    env: # $BUNDLE_GEMFILE must be set at job level, so it is set for all steps
      BUNDLE_GEMFILE: ${{ github.workspace }}/${{ matrix.gemfile }}.gemfile
    strategy:
      fail-fast: false
      matrix:
        include:
          # Style
          - ruby: "ruby"
            appraisal: "style"
            exec_cmd: "rake rubocop_gradual:check"
            gemfile: "Appraisal.root"
            rubygems: latest
            bundler: latest

    steps:
      - name: Checkout
        uses: actions/checkout@v6

      - name: Setup Ruby & RubyGems
        uses: ruby/setup-ruby@v1
        with:
          ruby-version: ${{ matrix.ruby }}
          rubygems: ${{ matrix.rubygems }}
          bundler: ${{ matrix.bundler }}
          bundler-cache: true

      # Raw `bundle` will use the BUNDLE_GEMFILE set to matrix.gemfile (i.e. Appraisal.root)
      # We need to do this first to get appraisal installed.
      # NOTE: This does not use the primary Gemfile at all.
      - name: Install Root Appraisal
        run: bundle install
      - name: Appraisal for ${{ matrix.appraisal }}
        run: bundle exec appraisal ${{ matrix.appraisal }} bundle
      - name: Run ${{ matrix.appraisal }} checks via ${{ matrix.exec_cmd }}
        run: bundle exec appraisal ${{ matrix.appraisal }} bundle exec ${{ matrix.exec_cmd }}
      - name: Validate RBS Types
        run: bundle exec appraisal ${{ matrix.appraisal }} bin/rbs validate

gemfiles/modular/style.gemfile

Note: I use platform :mri to guard some gems here because I re-use this style gemfile in my main Gemfile, which also supports JRuby and TruffleRuby locally.

gem "reek", "~> 6.5"

platform :mri do
  # gem "rubocop", "~> 1.73", ">= 1.73.2" # constrained by standard
  gem "rubocop-packaging", "~> 0.6", ">= 0.6.0"
  gem "standard", ">= 1.50"
  gem "rubocop-on-rbs", "~> 1.8" # ruby >= 3.1.0

  if ENV.fetch("RUBOCOP_LTS_LOCAL", "false").casecmp("true").zero?
    home = ENV["HOME"] || Dir.home
    gem "rubocop-lts", path: "#{home}/src/rubocop-lts/rubocop-lts"
    gem "rubocop-lts-rspec", path: "#{home}/src/rubocop-lts/rubocop-lts-rspec"
    gem "rubocop-ruby3_2", path: "#{home}/src/rubocop-lts/rubocop-ruby3_2"
    gem "standard-rubocop-lts", path: "#{home}/src/rubocop-lts/standard-rubocop-lts"
  else
    gem "rubocop-lts", "~> 24.0"
    gem "rubocop-rspec", "~> 3.6"
    gem "rubocop-ruby3_2"
  end
end

Appraisal.root.gemfile

source "https://gem.coop"

# Appraisal Root Gemfile is for running appraisal to generate the Appraisal Gemfiles
#   in gemfiles/*gemfile.
# On CI, we use it for the Appraisal-based builds.
# We do not load the standard Gemfile, as it is tailored for local development.

gemspec

Appraisals

# Only run linter on the latest version of Ruby (but, in support of oldest supported Ruby version)
appraise "style" do
  eval_gemfile "modular/style.gemfile" # <================ reused!
  eval_gemfile "modular/x_std_libs.gemfile"
end

# ... many other bundler scenarios

Gemfile (the canonical main one for local dev)

source "https://gem.coop"

git_source(:codeberg) { |repo_name| "https://codeberg.org/#{repo_name}" }
git_source(:gitlab) { |repo_name| "https://gitlab.com/#{repo_name}" }

# Specify your gem's dependencies in psych-merge.gemspec
gemspec

# runtime dependencies that we can't add to gemspec due to platform differences
eval_gemfile "gemfiles/modular/tree_sitter.gemfile"

# optional templating dependencies
eval_gemfile "gemfiles/modular/templating.gemfile"

eval_gemfile "gemfiles/modular/debug.gemfile"
eval_gemfile "gemfiles/modular/coverage.gemfile"
eval_gemfile "gemfiles/modular/style.gemfile" # <================ reused!
# ... many more

All of the files above can be found in the mentioned locations in ast-merge or tree_haver, and many of my other gems. In addition to the style scenario, I have composable gemfiles for:

  • coverage
  • debug
  • documentation
  • rspec
  • runtime_heads (the git source checkout of every dependency, so I can test my library against the HEAD version of every library it depends on)
  • x_std_libs (composable, incrementally-versioned-per-minor-version-of-ruby, ex-standard-lib gem dependencies)
  • and usually a few others that are specific to a project.

@stackmystack stackmystack mentioned this pull request Jan 9, 2026
@stackmystack
Collaborator

Thank you for your help, I will take a look asap.

Meanwhile, I took your branch and added commits on top of it: #96.

I'd still like to keep this PR open, so please review it. If there are any other missing APIs that you'd like to add to this PR, I say we keep them for later.

If you like, you can pick my commits, add them to your branch, fixup the 1st commit on top of yours because it should be an atomic commit (and please follow my convention in commits), and then we're good to go (but ofc let's monitor CI :) ).

I'm only saying that to keep the PR attribution to you, not to give you work :) If you prefer, I can close here and merge from the other side.

@pboling
Contributor Author

pboling commented Jan 9, 2026

@stackmystack cherry-picked :)

@pboling
Contributor Author

pboling commented Jan 9, 2026

Now I'll update my first commit to the conventional commit style you use

pboling and others added 5 commits January 9, 2026 15:01
BREAKING CHANGE:
  - `ts_language_version` renamed to `ts_language_abi_version`
  - `TSInputEncodingUTF16` split into `TSInputEncodingUTF16LE` and `TSInputEncodingUTF16BE`
    (now using UTF16LE as default for backward compatibility)
  - Cancellation flag API (`ts_parser_cancellation_flag`, `ts_parser_set_cancellation_flag`) removed
    - `Parser#cancellation_flag` and `Parser#cancellation_flag=` are now no-ops for backward compatibility
  - Timeout API (`ts_parser_timeout_micros`, `ts_parser_set_timeout_micros`) removed
    - `Parser#timeout_micros` and `Parser#timeout_micros=` are now no-ops for backward compatibility
  - Use `TSParseOptions` with `progress_callback` for cancellation/timeout functionality in 0.26+
- `TREE_SITTER_LANGUAGE_VERSION` is now 15 (was 14)
- `TREE_SITTER_MIN_COMPATIBLE_LANGUAGE_VERSION` is now 13 (was 6)
- Grammar files (.so) must be built against tree-sitter 0.26+ to work with this version

Closes Faveod#94

It has been deprecated for a long time, and there's no reason for it to
stay.

Remove unnecessary declarations
@pboling pboling force-pushed the feat/upgrade-tree-sitter-0.26 branch from 973a5d4 to de72490 Compare January 9, 2026 22:05
@pboling
Contributor Author

pboling commented Jan 9, 2026

@stackmystack First commit fixed up. I think this is ready

@stackmystack
Collaborator

@pboling I merged the other MR, sorry for the inconvenience, it was simpler. There were other changes to make, and the CI is obscurely dying on Ruby 3.1 + macOS when we enable sys libs, and I couldn't figure out why.

Thanks for your contribution!

@pboling
Contributor Author

pboling commented Jan 14, 2026

Not super important @stackmystack but your PR did not have this very small change: 4d869d6 (#95)

@stackmystack
Collaborator

Not super important @stackmystack but your PR did not have this very small change: 4d869d6 (#95)

Thanks. I pushed it here.
