👽️ Support tree-sitter 0.26.x C library API #95
Pull request overview
This PR updates the tree-sitter C library from v0.24.7 to v0.26.3, adapting to significant API changes in the 0.26.x release while maintaining backward compatibility where possible.
Key changes:
- Migrated from `ts_language_version` to `ts_language_abi_version` API
- Handled the split of `TSInputEncodingUTF16` into UTF16LE and UTF16BE variants (defaulting to UTF16LE for backward compatibility)
- Adapted to the removal of cancellation flag and timeout APIs by making them no-ops while preserving their Ruby interfaces
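To illustrate why the UTF-16 split matters, here is a minimal Ruby sketch (not code from this PR) showing that the same text has a different byte layout in UTF-16LE and UTF-16BE. It relies only on Ruby's built-in `String#encode` and `Array#pack`; the variable names are illustrative.

```ruby
# Illustrative only: byte order is the whole difference between the two
# UTF-16 variants that tree-sitter 0.26 now distinguishes.
text = "hi"

le_bytes = text.encode("UTF-16LE").bytes
be_bytes = text.encode("UTF-16BE").bytes

puts le_bytes.inspect # [104, 0, 105, 0]
puts be_bytes.inspect # [0, 104, 0, 105]

# Native byte order of the machine running this script:
# "S" packs a 16-bit unsigned integer in native order, "v" in little-endian.
little_endian_host = [1].pack("S") == [1].pack("v")
puts "little-endian host: #{little_endian_host}"
```

Defaulting to the LE variant matches what most hosts use natively, but callers holding big-endian buffers would need the BE variant instead.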
Reviewed changes
Copilot reviewed 5 out of 6 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| lib/tree_sitter/version.rb | Updates TREESITTER_VERSION from 0.24.7 to 0.26.3 |
| ext/tree_sitter/parser.c | Makes cancellation_flag and timeout_micros methods no-ops for backward compatibility after API removal |
| ext/tree_sitter/language.c | Migrates to ts_language_abi_version, improves dlopen/dlsym error handling, adds const correctness to typedef, and adds abi_version alias method |
| ext/tree_sitter/encoding.c | Maps UTF16 encoding to UTF16LE for backward compatibility after encoding enum split |
| News.md | Documents API changes and breaking changes for tree-sitter 0.26.3 compatibility |
| .gitignore | Adds build artifacts patterns for compiled extensions |
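A rough Ruby-level sketch of what "no-op while preserving the interface" means. The real change lives in C in ext/tree_sitter/parser.c; the class name `NoOpParserShim` and the return values (`0` and `nil`) are assumptions made for illustration, not the extension's confirmed behavior.

```ruby
# Hypothetical stand-in for the C-backed parser; not the real TreeSitter::Parser.
class NoOpParserShim
  # Assumed return value: 0, meaning "no timeout configured".
  def timeout_micros
    0
  end

  # Accepts and ignores the value so existing callers do not hit NoMethodError.
  def timeout_micros=(_value)
    nil
  end

  # Assumed return value: nil, meaning "no flag set".
  def cancellation_flag
    nil
  end

  # Accepts and ignores the value, same as the timeout setter.
  def cancellation_flag=(_value)
    nil
  end
end

parser = NoOpParserShim.new
parser.timeout_micros = 5_000 # silently ignored
puts parser.timeout_micros    # prints 0
```

The trade-off of this approach is that old call sites keep running, but a caller relying on the timeout actually firing gets no signal that it no longer does.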
|
I'll look into these CI failures today. |
|
Thanks for the submission! I'll try to take a look at the failures too in the upcoming days … I'm a little busy these days, sorry. If you manage to fix the CI before I intervene it might go faster :)
|
Does anyone know how I can turn off the automatic slop review? No wonder the Zig people left GitHub; none of GitHub's support instructions helped.
|
I think it must be in project settings somewhere... |
I see you didn't change You will need to bump them to the latest versions that work with the latest tree-sitter. |
It's used by |
|
@stackmystack As an aside on tsdl - it is very cool! I built a similar tool, a GitHub Action, because I am doing a lot with different parsers: https://github.com/kettle-rb/ts-grammar-action If what I built there could leverage what you have in tsdl, I'd love to learn how I could build it on top! I'll be using it to install tree-sitter and many grammars in my ast-merge suite of ruby gems - which are for providing a normalized merging DSL across all languages and grammars.
Thanks!
I skimmed through the action, and I see that there's a lot of shell scripting to download and compile those parsers. I made tsdl because I literally had almost the same script used in different settings: local dev, CI, and then for packaging and installing as done here. I am sure you can leverage tsdl, especially since it gives you the possibility to build quirky parsers like php (which has 2 parsers php/php and php/php_only) and typescript (if I'm not mistaken, which should be built with
That's awesome! tsdl works great, but it was a learning experience for me, wanting to learn async rust (even if it's an overkill) and trying many different ideas, no matter how unnecessarily complicated they are. But in 1.5.0, as it stands, I still have an issue with caching, which makes it really annoying when testing or when you have scripts that maintain your env … it will rebuild regardless of what happens. In the 2.0.0 version, which I'm planning on releasing asap, I added caching and build-dir locking, and I also re-wrote the internals so that it runs concurrent builds with a single-threaded async runtime instead of the needless multithreaded one. Oh and the installed parsers are now hard-links instead of copies which also saves some KBs on disks :) But I don't understand what's the use-case for I also saw another project referenced, |
Yes, appraisal2 basically manages the bundler env, allowing for a gemfile-per-scenario approach. I'll leave an example at bottom. I enhanced the original, almost-dead, project from Thoughtbot, adding support for more of bundler's sugar, like And that is actually what
Appraisal2 Example
I have my CI workflows separated because some gems for some use cases will not install on all platforms, and I don't need them to. I only need to run style checks on one platform, for example.
.github/workflows/style.yaml
name: Style
permissions:
  contents: read
on:
  push:
    branches:
      - 'main'
      - '*-stable'
    tags:
      - '!*' # Do not execute on tags
  pull_request:
    branches:
      - '*'
  # Allow manually triggering the workflow.
  workflow_dispatch:
# Cancels all previous workflow runs for the same branch that have not yet completed.
concurrency:
  # The concurrency group contains the workflow name and the branch name.
  group: "${{ github.workflow }}-${{ github.ref }}"
  cancel-in-progress: true
jobs:
  rubocop:
    if: "!contains(github.event.commits[0].message, '[ci skip]') && !contains(github.event.commits[0].message, '[skip ci]')"
    name: Style on ${{ matrix.ruby }}@current
    runs-on: ubuntu-latest
    env: # $BUNDLE_GEMFILE must be set at job level, so it is set for all steps
      BUNDLE_GEMFILE: ${{ github.workspace }}/${{ matrix.gemfile }}.gemfile
    strategy:
      fail-fast: false
      matrix:
        include:
          # Style
          - ruby: "ruby"
            appraisal: "style"
            exec_cmd: "rake rubocop_gradual:check"
            gemfile: "Appraisal.root"
            rubygems: latest
            bundler: latest
    steps:
      - name: Checkout
        uses: actions/checkout@v6
      - name: Setup Ruby & RubyGems
        uses: ruby/setup-ruby@v1
        with:
          ruby-version: ${{ matrix.ruby }}
          rubygems: ${{ matrix.rubygems }}
          bundler: ${{ matrix.bundler }}
          bundler-cache: true
      # Raw `bundle` will use the BUNDLE_GEMFILE set to matrix.gemfile (i.e. Appraisal.root)
      # We need to do this first to get appraisal installed.
      # NOTE: This does not use the primary Gemfile at all.
      - name: Install Root Appraisal
        run: bundle install
      - name: Appraisal for ${{ matrix.appraisal }}
        run: bundle exec appraisal ${{ matrix.appraisal }} bundle
      - name: Run ${{ matrix.appraisal }} checks via ${{ matrix.exec_cmd }}
        run: bundle exec appraisal ${{ matrix.appraisal }} bundle exec ${{ matrix.exec_cmd }}
      - name: Validate RBS Types
        run: bundle exec appraisal ${{ matrix.appraisal }} bin/rbs validate
gemfiles/modular/style.gemfile
Note: I use
gem "reek", "~> 6.5"
platform :mri do
  # gem "rubocop", "~> 1.73", ">= 1.73.2" # constrained by standard
  gem "rubocop-packaging", "~> 0.6", ">= 0.6.0"
  gem "standard", ">= 1.50"
  gem "rubocop-on-rbs", "~> 1.8" # ruby >= 3.1.0
  if ENV.fetch("RUBOCOP_LTS_LOCAL", "false").casecmp("true").zero?
    home = ENV["HOME"] || Dir.home
    gem "rubocop-lts", path: "#{home}/src/rubocop-lts/rubocop-lts"
    gem "rubocop-lts-rspec", path: "#{home}/src/rubocop-lts/rubocop-lts-rspec"
    gem "rubocop-ruby3_2", path: "#{home}/src/rubocop-lts/rubocop-ruby3_2"
    gem "standard-rubocop-lts", path: "#{home}/src/rubocop-lts/standard-rubocop-lts"
  else
    gem "rubocop-lts", "~> 24.0"
    gem "rubocop-rspec", "~> 3.6"
    gem "rubocop-ruby3_2"
  end
end
Appraisal.root.gemfile
source "https://gem.coop"
# Appraisal Root Gemfile is for running appraisal to generate the Appraisal Gemfiles
# in gemfiles/*gemfile.
# On CI, we use it for the Appraisal-based builds.
# We do not load the standard Gemfile, as it is tailored for local development.
gemspec
Appraisals
# Only run linter on the latest version of Ruby (but, in support of oldest supported Ruby version)
appraise "style" do
  eval_gemfile "modular/style.gemfile" # <================ reused!
  eval_gemfile "modular/x_std_libs.gemfile"
end
# ... many other bundler scenarios
Gemfile (the canonical main one for local dev)
source "https://gem.coop"
git_source(:codeberg) { |repo_name| "https://codeberg.org/#{repo_name}" }
git_source(:gitlab) { |repo_name| "https://gitlab.com/#{repo_name}" }
# Specify your gem's dependencies in psych-merge.gemspec
gemspec
# runtime dependencies that we can't add to gemspec due to platform differences
eval_gemfile "gemfiles/modular/tree_sitter.gemfile"
# optional templating dependencies
eval_gemfile "gemfiles/modular/templating.gemfile"
eval_gemfile "gemfiles/modular/debug.gemfile"
eval_gemfile "gemfiles/modular/coverage.gemfile"
eval_gemfile "gemfiles/modular/style.gemfile" # <================ reused!
# ... many more
All of the files above can be found in the mentioned locations in ast-merge or tree_haver, and many of my other gems. In addition to the
|
|
Thank you for your help, I will take a look asap. Meanwhile, I took your branch and added commits on top of it: #96. I'd still like to keep the PR here open, so please review it, if there are any other APIs missing that you'd like to add to this PR, I say we keep it for later. If you like, you can pick my commits, add them to your branch, fixup the 1st commit on top of yours because it should be an atomic commit (and please follow my convention in commits), and then we're good to go (but ofc let's monitor CI :) ). I'm only saying that to keep the PR attribution to you, not to give you work :) If you prefer, I can close here and merge from the other side.
|
@stackmystack cherry-picked :) |
|
Now I'll update my first commit to the conventional commit style you use |
BREAKING CHANGE:
- `ts_language_version` renamed to `ts_language_abi_version`
- `TSInputEncodingUTF16` split into `TSInputEncodingUTF16LE` and `TSInputEncodingUTF16BE`
(now using UTF16LE as default for backward compatibility)
- Cancellation flag API (`ts_parser_cancellation_flag`, `ts_parser_set_cancellation_flag`) removed
- `Parser#cancellation_flag` and `Parser#cancellation_flag=` are now no-ops for backward compatibility
- Timeout API (`ts_parser_timeout_micros`, `ts_parser_set_timeout_micros`) removed
- `Parser#timeout_micros` and `Parser#timeout_micros=` are now no-ops for backward compatibility
- Use `TSParseOptions` with `progress_callback` for cancellation/timeout functionality in 0.26+
- `TREE_SITTER_LANGUAGE_VERSION` is now 15 (was 14)
- `TREE_SITTER_MIN_COMPATIBLE_LANGUAGE_VERSION` is now 13 (was 6)
- Grammar files (.so) must be built against tree-sitter 0.26+ to work with this version
Closes Faveod#94
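The version bounds in this commit message amount to a simple range check. A minimal sketch in Ruby, with the two constants copied from the list above; the helper name `abi_compatible?` is hypothetical and not part of ruby_tree_sitter, where the actual check is performed by the tree-sitter C library.

```ruby
# Values as stated in the commit message for tree-sitter 0.26.
LANGUAGE_VERSION = 15
MIN_COMPATIBLE_LANGUAGE_VERSION = 13

# Hypothetical helper: true when a grammar's ABI version is in the accepted range.
def abi_compatible?(grammar_abi_version)
  (MIN_COMPATIBLE_LANGUAGE_VERSION..LANGUAGE_VERSION).cover?(grammar_abi_version)
end

[12, 13, 14, 15, 16].each do |version|
  verdict = abi_compatible?(version) ? "accepted" : "rejected"
  puts "ABI #{version}: #{verdict}"
end
```

Under these bounds, a grammar built for ABI 12 or lower would need to be regenerated before it can be loaded.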
It has been deprecated for a long time, and there's no reason for it to stay.
Remove unnecessary declarations
973a5d4 to de72490
|
@stackmystack First commit fixed up. I think this is ready |
… the return type of `make_ts_language()`
|
@pboling I merged the other MR, sorry for the inconvenience, it was simpler. There were other changes to make, and the CI is obscurely dying on ruby 3.1 + macOS when we enable sys libs, and I couldn't figure out why. Thanks for your contribution!
|
Not super important @stackmystack but your PR did not have this very small change: |
Thanks. I pushed it here. |
This PR updates ruby_tree_sitter to support the tree-sitter 0.26.x C library, enabling grammars with `LANGUAGE_VERSION 15` to be loaded and used.
Changes
ext/tree_sitter/language.c
- Replaced `ts_language_version()` with `ts_language_abi_version()` (the new API function)
- `LANGUAGE_VERSION` constant updated from 14 to 15 (defined by tree-sitter)
- `MIN_COMPATIBLE_LANGUAGE_VERSION` updated to 13 (was 6); this is defined by tree-sitter, meaning grammars built with very old tree-sitter CLI versions are no longer supported
ext/tree_sitter/encoding.c
- Replaced `TSInputEncodingUTF16` with `TSInputEncodingUTF16LE` (the new default for little-endian systems, which is the common case)
ext/tree_sitter/parser.c
- Removed the `parser_get_timeout_micros()` and `parser_set_timeout_micros()` methods (the underlying C functions were removed in tree-sitter 0.26)
- Removed the `parser_get_cancellation_flag()` and `parser_set_cancellation_flag()` methods (the underlying C functions were removed in tree-sitter 0.26)
ext/tree_sitter/extconf.rb
- Updated the `TREE_SITTER_VERSION` constant to "0.26.3"
Breaking Changes
Ruby API: The following methods are removed from `TreeSitter::Parser`:
- `timeout_micros` / `timeout_micros=`
- `cancellation_flag` / `cancellation_flag=`
These were deprecated in tree-sitter 0.25 and removed in 0.26. Users should use alternative cancellation mechanisms (e.g., Ruby's Timeout module).
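A minimal sketch of the Timeout-based alternative mentioned above. `slow_parse` is a hypothetical stand-in that just sleeps; it is not a ruby_tree_sitter call, and `parse_with_limit` is likewise an illustrative helper. One caveat, stated as an assumption: `Timeout` interrupts Ruby code, so whether it can cut short a long-running native parse depends on how the C extension behaves, which this sketch does not verify.

```ruby
require "timeout"

# Hypothetical stand-in for a long-running parse; replace with a real call.
def slow_parse(seconds)
  sleep(seconds)
  :parsed
end

# Returns the block's result, or nil if the block exceeds the limit.
def parse_with_limit(limit_seconds)
  Timeout.timeout(limit_seconds) { yield }
rescue Timeout::Error
  nil
end

fast = parse_with_limit(1.0) { slow_parse(0.01) }
slow = parse_with_limit(0.05) { slow_parse(1.0) }

puts fast.inspect # :parsed
puts slow.inspect # nil
```

Returning `nil` on timeout is a design choice for this sketch; re-raising or returning a partial result may suit real callers better.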
Backward Compatibility
- Grammars with `LANGUAGE_VERSION` 13-15 are supported (tree-sitter 0.26.3 constraint)
- Grammars with `LANGUAGE_VERSION` < 13 are rejected by tree-sitter 0.26.x
Testing