feat: Determine comment tokens from injection layers #12759

Open: nik-rev wants to merge 41 commits into master from determine-comment-tokens

@nik-rev (Contributor) commented Feb 2, 2025

Previously, if you had a file like this:

<p>Some text 1234</p>
<script type="text/javascript">
  // bar();
  foo();
</script>

Pressing Space + c (toggle comment) on the JavaScript comment would have wrapped the line in the HTML comment tokens instead of removing the //:

<p>Some text 1234</p>
<script type="text/javascript">
  <!-- // bar(); -->
  foo();
</script>

This PR fixes that. The comment token is now taken from the injected language (JavaScript here), so toggling the comment uncomments the line:

<p>Some text 1234</p>
<script type="text/javascript">
  bar();
  foo();
</script>

It also works for comment continuation (for example, when pressing o or inserting a newline inside a comment).
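
The idea can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `Layer`, `CommentToken`, and `comment_token_at` are hypothetical names, and a real injection layer may cover several disjoint ranges rather than the single range assumed here. The sketch picks the smallest layer that contains the cursor and uses that layer's comment token.

```rust
use std::ops::Range;

/// Hypothetical comment token kinds (not Helix's actual types).
#[derive(Debug, Clone, Copy, PartialEq)]
enum CommentToken {
    /// A line comment token such as `//`.
    Line(&'static str),
    /// A block comment pair such as `<!--` and `-->`.
    Block(&'static str, &'static str),
}

/// Hypothetical stand-in for one injection layer.
/// Assumption: each layer covers exactly one contiguous byte range.
struct Layer {
    range: Range<usize>,
    token: CommentToken,
}

/// Returns the token of the smallest layer containing `pos`,
/// or `None` when no layer covers that position.
fn comment_token_at(layers: &[Layer], pos: usize) -> Option<CommentToken> {
    layers
        .iter()
        .filter(|layer| layer.range.contains(&pos))
        .min_by_key(|layer| layer.range.end - layer.range.start)
        .map(|layer| layer.token)
}
```

With an HTML root layer and a JavaScript layer nested inside it, a cursor inside the `<script>` body resolves to `//`, while a cursor anywhere else in the file falls back to the HTML block tokens.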

Additionally, the PR adds a new :tree-sitter-injections command:

(screenshot: `:tree-sitter-injections` output)

With this file:

<!doctype html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <title>Multi-language Example</title>
    <style>
      /* CSS Example */
      body {
        font-family: Arial, sans-serif;
        background-color: #f0f0f0;
        margin: 0;
        padding: 0;
      }
      .container {
        max-width: 800px;
        margin: 20px auto;
        background: #fff;
        border: 1px solid #ddd;
        border-radius: 8px;
        box-shadow: 0 4px 6px rgba(0, 0, 0, 0.1);
        padding: 20px;
      }
      .highlight {
        color: #ff5722;
        font-weight: bold;
      }
    </style>
  </head>
  <body>
    <div class="container">
      <h1>Welcome to Multi-language HTML</h1>
      <p>
        This is an example of <span class="highlight">embedded languages</span>.
      </p>
      <button id="regexButton">Test Regex</button>
      <pre id="regexResult"></pre>
    </div>
    <script>
      // JavaScript Example with JSDoc and Regex
      /**
       * Validates an email using a regex pattern.
       * @param {string} email - The email to validate.
       * @returns {boolean} True if valid, false otherwise.
       */
      function validateEmail(email) {
        const emailRegex = /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/; // Regex Example
        return emailRegex.test(email);
      }

      // Event Listener Example
      document.getElementById("regexButton").addEventListener("click", () => {
        const testEmails = ["[email protected]", "invalid-email", "user@domain"];
        const results = testEmails.map(
          (email) => `${email}: ${validateEmail(email)}`,
        );
        document.getElementById("regexResult").textContent = results.join("\n");
      });
    </script>
  </body>
</html>

We get the following when we run :tree-sitter-injections:

(screenshot: `:tree-sitter-injections` output for the file above)

Languages Tested

I've added a lot of tests, each of which helped me fix a bug or two while I was implementing this. I ended up extracting all comment-related integration tests into a separate module.

The core functionality works. What's left is to make sure individual languages work as well as they can. For example, Svelte's comment tokens had to be adjusted in this PR for the best experience.

I've tested these languages manually as well. If you use a language that relies on injections, please test it and comment here, and I'll add it to the list.

  • html
    • css
    • javascript
  • svelte
    • javascript
    • css
    • typescript

JSX and TSX

These languages don't use tree-sitter injections for JSX elements. For example:

"use client";

import { useState } from "react";

export default function Home() {
  const [clickCount, setClickCount] = useState(0);

  return (
    <div>
      My counter:
      <button onClick={() => setClickCount(4)}>{clickCount}</button>
    </div>
  );
}

This file has just this injection:

(screenshot: `:tree-sitter-injections` output)

This means the PR won't affect them.

You can use Space + C to comment JSX at the moment. It would be nice if Space + c could produce line comments using the tokens {/* and */}, though; right now it uses //.

To do this, .tsx and .jsx files could use ts and js as the file's "main" language, and the tsx and jsx languages would then be injected whenever a tag is encountered.

I'm not exactly sure how to accomplish this, though, and it isn't really related to this PR. It can be done as a follow-up if I or someone else figures it out.
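
To make the desired result concrete, here is a small sketch of what toggling a JSX line with {/* and */} could look like. It only illustrates the text transformation and is not part of this PR; `toggle_block_comment` is a hypothetical helper, and it assumes a single line with no nested comments.

```rust
/// Toggles a block-style comment on one line, preserving its indentation.
/// Hypothetical helper: assumes `start` and `end` are non-empty tokens.
fn toggle_block_comment(line: &str, start: &str, end: &str) -> String {
    // Split the line into leading whitespace and the remaining text.
    let indent_len = line.len() - line.trim_start().len();
    let (indent, body) = line.split_at(indent_len);
    let body = body.trim_end();

    // If the text is already wrapped in the tokens, unwrap it.
    let inner = body
        .strip_prefix(start)
        .and_then(|rest| rest.strip_suffix(end));

    match inner {
        Some(inner) => format!("{}{}", indent, inner.trim()),
        None => format!("{}{} {} {}", indent, start, body, end),
    }
}
```

Applied to the `My counter:` line from the example above with `{/*` and `*/}`, this wraps the text while keeping its indentation, and applying it a second time restores the original line.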

Closes #7364
Closes #11647
Related to #9425

@nik-rev nik-rev force-pushed the determine-comment-tokens branch from 63e26be to 3281c81 Compare February 2, 2025 22:36
@nik-rev nik-rev changed the title feat: Determine comment tokens from injection layers fix: Determine comment tokens from injection layers Feb 2, 2025
@nik-rev nik-rev changed the title fix: Determine comment tokens from injection layers feat: Determine comment tokens from injection layers Feb 2, 2025
@nik-rev nik-rev marked this pull request as draft February 2, 2025 23:14
@nik-rev nik-rev marked this pull request as ready for review February 2, 2025 23:14
@nik-rev nik-rev force-pushed the determine-comment-tokens branch from 019fe67 to c585fca Compare February 2, 2025 23:19
@nik-rev nik-rev marked this pull request as draft February 2, 2025 23:23
@nik-rev nik-rev marked this pull request as ready for review February 3, 2025 13:04