This template provides a minimal setup to get React working in Vite with HMR and some ESLint rules.
Currently, two official plugins are available:
- @vitejs/plugin-react uses Babel (or oxc when used in rolldown-vite) for Fast Refresh
- @vitejs/plugin-react-swc uses SWC for Fast Refresh
The React Compiler is not enabled on this template because of its impact on dev & build performance. To add it, see the React Compiler documentation.
If you are developing a production application, we recommend updating the configuration to enable type-aware lint rules:

    // eslint.config.js
    import { defineConfig, globalIgnores } from 'eslint/config'
    import tseslint from 'typescript-eslint'

    export default defineConfig([
      globalIgnores(['dist']),
      {
        files: ['**/*.{ts,tsx}'],
        extends: [
          // Other configs...

          // Remove tseslint.configs.recommended and replace with this
          tseslint.configs.recommendedTypeChecked,
          // Alternatively, use this for stricter rules
          tseslint.configs.strictTypeChecked,
          // Optionally, add this for stylistic rules
          tseslint.configs.stylisticTypeChecked,

          // Other configs...
        ],
        languageOptions: {
          parserOptions: {
            // Type-aware rules need the tsconfig files that cover the linted sources
            project: ['./tsconfig.node.json', './tsconfig.app.json'],
            tsconfigRootDir: import.meta.dirname,
          },
          // other options...
        },
      },
    ])

You can also install eslint-plugin-react-x and eslint-plugin-react-dom for React-specific lint rules:

    // eslint.config.js
    import { defineConfig, globalIgnores } from 'eslint/config'
    import reactX from 'eslint-plugin-react-x'
    import reactDom from 'eslint-plugin-react-dom'

    export default defineConfig([
      globalIgnores(['dist']),
      {
        files: ['**/*.{ts,tsx}'],
        extends: [
          // Other configs...
          // Enable lint rules for React
          reactX.configs['recommended-typescript'],
          // Enable lint rules for React DOM
          reactDom.configs.recommended,
        ],
        languageOptions: {
          parserOptions: {
            project: ['./tsconfig.node.json', './tsconfig.app.json'],
            tsconfigRootDir: import.meta.dirname,
          },
          // other options...
        },
      },
    ])

The goal of the demo is to get the agent (black dot) to reach the goal cell in the fewest steps the grid world allows. Success shows up in two places: in the middle column of the interface, as bright arrows along a single path that the agent always follows, and in the reward plots in the rightmost column, as rising lines (or flat lines at relatively large values).
The sliders in the far-left column control the environment parameters; all but the "step cost" slider are self-explanatory. The "step cost" slider adds a penalty to the reward the agent sees per episode, which encourages it to take fewer steps. The sliders in the middle column control the agent parameters:
- Memory damping slider: determines how quickly the agent forgets its past actions. A high value means the agent quickly forgets what it did in past episodes.
- Reward coupling slider: controls the factor by which the agent's reward is scaled. Higher values magnify the reward it receives.
- Glow decay slider: controls the strength of past rewards. A small value means that all past rewards have equal strength.
- Exploration and Temperature parameter sliders: determine how often the agent acts on what it has learned versus at random. Larger values of either produce more random movement. The two differ in mechanism: "exploration" sets a probability that decides whether the agent uses what it learned or acts randomly, whereas a large "temperature parameter" temporarily scrambles what it learned into noise.
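
The difference between the last two sliders may be easier to see in code. The following is only a minimal sketch and is not taken from the demo's source: it assumes that "exploration" behaves like an epsilon-greedy rule and that "temperature parameter" behaves like a softmax temperature, which is one common reading of the description above. All names (`softmax`, `selectWithExploration`, `selectWithTemperature`, `weights`) are hypothetical.

```typescript
// Hypothetical sketch: the demo's real implementation may differ.
// `weights` stands for whatever the agent has learned about each action in one cell.

/** Softmax over action weights; a large temperature flattens the result toward uniform. */
function softmax(weights: number[], temperature: number): number[] {
  // Subtract the maximum before exponentiating for numerical stability.
  const max = Math.max(...weights);
  const exps = weights.map((w) => Math.exp((w - max) / temperature));
  const total = exps.reduce((sum, e) => sum + e, 0);
  return exps.map((e) => e / total);
}

/** Index of the largest weight (the first one wins on ties). */
function argmax(weights: number[]): number {
  let best = 0;
  let bestValue = -Infinity;
  weights.forEach((w, i) => {
    if (w > bestValue) {
      bestValue = w;
      best = i;
    }
  });
  return best;
}

/** Draw an index from a discrete probability distribution. */
function sample(probs: number[], rand: () => number): number {
  const r = rand();
  let cumulative = 0;
  const index = probs.findIndex((p) => {
    cumulative += p;
    return r < cumulative;
  });
  // Fall back to the last index if rounding error left r just above the total.
  return index === -1 ? probs.length - 1 : index;
}

/** Assumed "exploration" rule: with probability `exploration`, ignore the weights entirely. */
function selectWithExploration(
  weights: number[],
  exploration: number,
  rand: () => number = Math.random,
): number {
  if (rand() < exploration) {
    return Math.floor(rand() * weights.length); // uniformly random action
  }
  return argmax(weights); // act on what was learned
}

/** Assumed "temperature" rule: always consult the weights, but blur them when temperature is large. */
function selectWithTemperature(
  weights: number[],
  temperature: number,
  rand: () => number = Math.random,
): number {
  return sample(softmax(weights, temperature), rand);
}
```

Under these assumptions, an exploration value of 0 always picks the best-known action and a value of 1 always picks uniformly at random. A small temperature concentrates almost all probability on the best-known action, while a very large one makes every action close to equally likely even though the learned weights themselves are unchanged.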