Avoid scanning the filesystem to filter files per check #1262
First, thx for the PR! I don't have access to my dev machine at the moment, but I am reasonably sure that this change will cause problems when piping files via STDIN (via […]). I will have to verify if these scenarios are sufficiently covered by tests right now (and improve the test suite if not) 👍
Hey @rrrene, let me know if you want me to convert this into an issue and close this PR, or if you would like me to rebase this with master and resolve the conflicts.
@Wigny sorry for the slow progress on this, I am just really concerned about regressions. We should probably think about re-architecting this functionality, since you correctly raised the fact that for single-run invocations of […]. Let's look up where this function is called from and if we could refactor it to receive an […].
Hey @rrrene, no problem!
That makes a lot of sense and is completely reasonable.
I could try making this refactor, but I wanted to let you know that this MR was more of an attempt to solve the problem we faced, hoping you can tell if it was going in the right direction or not, since I'm also not 100% sure that those changes will not introduce any undesired behaviour. Therefore, let me know if you want me to pursue this refactor, or if you would like to address it. Again, no need to rush on this, and thanks for maintaining this awesome lib!
Checks with custom :files patterns previously triggered a recursive scan of the entire project tree on every check run. The filtering is now done in-memory against already-loaded source files. This also makes the STDIN special case unnecessary.
We started noticing credo became much slower lately when trying to execute checks that set files.included glob patterns, like Credo.Check.Refactor.PassAsyncInTestCases.

This slowdown seems to be caused, according to my tests, by Credo.Sources.find_in_dir/3, which uses Path.wildcard to list matching file paths from the filesystem, which might be running a recursive scan of the entire project tree.

This PR changes the code to filter the already known file paths in memory using Credo.Sources.filename_matches? instead of listing them from the filesystem.

Tested it in our project, and the run time was reduced from several minutes to seconds.
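
To illustrate the idea of matching already known paths in memory rather than asking the filesystem, here is a minimal sketch. It is not Credo's implementation: the module name InMemoryFilter and its functions are hypothetical, and the glob handling is deliberately simplified (only `**/`, `**`, `*` and `?` are supported), so it only approximates what Path.wildcard would return.

```elixir
defmodule InMemoryFilter do
  # Hypothetical helper for illustration only; not part of Credo.

  # Returns the paths that match at least one glob pattern.
  # No filesystem access happens here: the paths are assumed to be known already.
  def filter(paths, patterns) do
    regexes = Enum.map(patterns, &glob_to_regex/1)

    Enum.filter(paths, fn path ->
      Enum.any?(regexes, &Regex.match?(&1, path))
    end)
  end

  # Translates a simplified glob into an anchored regex.
  # Assumption: only `**/`, `**`, `*` and `?` need to be supported.
  def glob_to_regex(glob) do
    source =
      glob
      |> Regex.escape()
      # `**/` matches zero or more directory levels
      |> String.replace("\\*\\*/", "(?:.*/)?")
      # a remaining `**` matches anything, including slashes
      |> String.replace("\\*\\*", ".*")
      # `*` matches within a single path segment
      |> String.replace("\\*", "[^/]*")
      # `?` matches one character within a segment
      |> String.replace("\\?", "[^/]")

    Regex.compile!("^" <> source <> "$")
  end
end

known_paths = [
  "lib/foo.ex",
  "test/foo_test.exs",
  "test/nested/bar_test.exs",
  "test/support/helper.ex"
]

matched = InMemoryFilter.filter(known_paths, ["test/**/*_test.exs"])
IO.inspect(matched)
```

With the sample paths above, only the two `_test.exs` files are kept. The cost is proportional to the number of known paths times the number of patterns, independent of how large the project tree on disk is.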