c# - How to tokenize/parse string literals from JavaScript source code - Stack Overflow

I am working on a program in C# that needs to load some JavaScript code, parse it, and do some processing on the string literals found in the code (such as overwriting them with something else).

My problem is that I'm having a difficult time devising an elegant way to actually find the string literals in the JavaScript code in the first place.

For example, take a look at the sample JavaScript code below. Do you see how even Stack Overflow's code highlighter is able to pick out string literals in the code and make them red in color?

I want to do basically the same thing, except that instead of turning them a different color I will do some processing on them and possibly replace each one with an entirely different string literal.

var dp = {
    sh :                    // dp.sh
    {
        Utils   : {},       // dp.sh.Utils
        Brushes : {},       // dp.sh.Brushes
        Strings : {},
        Version : '1.3.0'
    }
};

dp.sh.Strings = {
    AboutDialog : '<html><head><title>About...</title></head><body class="dp-about"><table cellspacing="0"><tr><td class="copy"><p class="title">dp.SyntaxHighlighter</div><div class="para">Version: {V}</p><p><a href="http://www.dreamprojections./syntaxhighlighter/?ref=about" target="_blank">http://www.dreamprojections./SyntaxHighlighter</a></p>&copy;2004-2005 Alex Gorbatchev. All right reserved.</td></tr><tr><td class="footer"><input type="button" class="close" value="OK" onClick="window.close()"/></td></tr></table></body></html>',
    
    // tools
    ExpandCode : '+ expand code',
    ViewPlain : 'view plain',
    Print : 'print',
    CopyToClipboard : 'copy to clipboard',
    About : '?',
    
    CopiedToClipboard : 'The code is in your clipboard now.'
};

dp.test1 = 'some test blah blah blah' +  someFunction()  + 'asdfasdfsdf';
dp.test2 = 'some test blah blah blah' +  'xxxxx'  + 'asdfasdfsdf';
dp.test3 = 'some test blah blah blah' +  "XXXXsdf "" \" \' ' sdfdff "" \" \' ' asdfASDaSD FASDF SDF'  + 'asdfasdfsdf";

dp.SyntaxHighlighter = dp.sh;

I have tried scanning through the code looking for quotes, but it gets complicated when the string literal contains escape characters. The other solution I was thinking of is to use a regex, but I am not strong enough with regular expressions and I'm not even sure that is the avenue I should be pursuing.

I would like to see what Stack Overflow thinks. Thanks a bunch!

Asked Mar 24, 2009 at 6:05 by 7wp; edited Sep 26, 2020 at 15:12 by Alex.

1 Answer

The article "Regexes in Depth: Advanced Quoted String Matching" has some good examples of how to do this with a regex.

One of the approaches is this:

(["'])(?:(?!\1)[^\\]|\\.)*\1
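To make that pattern easier to read, here is a minimal sketch that writes it out piece by piece using RegexOptions.IgnorePatternWhitespace and lists the literals it finds. The names QuotedStringDemo and FindLiterals are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class QuotedStringDemo
{
    // Same pattern as above, spread over several lines so each piece
    // can carry a comment (IgnorePatternWhitespace enables # comments).
    public static readonly Regex StringLiteral = new Regex(@"
        ([""'])          # group 1: the opening quote, double or single
        (?:
            (?!\1)[^\\]  # a character that is neither the opening quote nor a backslash
          |
            \\.          # or a backslash plus the character it escapes
        )*
        \1               # the closing quote must be the same as the opening one
        ", RegexOptions.IgnorePatternWhitespace);

    // Hypothetical helper: returns every matched literal, quotes included.
    public static string[] FindLiterals(string source) =>
        StringLiteral.Matches(source).Cast<Match>().Select(m => m.Value).ToArray();

    public static void Main()
    {
        // JavaScript source: dp.test2 = 'it\'s' + "x";
        string js = "dp.test2 = 'it\\'s' + \"x\";";
        foreach (string literal in FindLiterals(js))
            Console.WriteLine(literal);
    }
}
```

Because the backslash alternative consumes the escaped character together with its backslash, an escaped quote such as \' never ends the literal early.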

You could use it as follows:

// requires: using System.Text.RegularExpressions;
string modifiedJavascriptText = Regex.Replace(
    javascriptText,
    @"([""'])(?:(?!\1)[^\\]|\\.)*\1",   // "" is an escaped quote inside a C# verbatim string
    m => m.Value.ToUpper());            // m.Value is the whole literal, quotes included

In this case, every string literal is converted to upper case.
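One caveat: the pattern knows nothing about JavaScript comments, so an apostrophe inside a comment such as `// don't` would be taken as the start of a string. A common workaround, sketched below under that assumption, is to match comments in the same regex and return them unchanged. LiteralRewriter and RewriteLiterals are hypothetical names, and the sketch still does not handle regex literals or template literals.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LiteralRewriter
{
    // Comments are listed first so that quotes inside them are consumed
    // as part of the comment and never start a string match.
    // The named group "q" only participates when a string literal matched.
    private static readonly Regex CommentOrString = new Regex(
        @"//[^\n]*|/\*[\s\S]*?\*/|(?<q>[""'])(?:(?!\k<q>)[^\\]|\\.)*\k<q>");

    // Applies transform to each string literal (quotes included) and
    // leaves comments and all other code untouched.
    public static string RewriteLiterals(string source, Func<string, string> transform)
    {
        return CommentOrString.Replace(source, m =>
            m.Groups["q"].Success ? transform(m.Value) : m.Value);
    }

    public static void Main()
    {
        string js = "var a = 'abc'; // don't touch 'this'\nvar b = \"http://x\";";
        Console.WriteLine(RewriteLiterals(js, s => s.ToUpper()));
        // prints:
        // var a = 'ABC'; // don't touch 'this'
        // var b = "HTTP://X";
    }
}
```

Because the regex scans left to right, a `//` that appears inside a string (as in the URL above) is consumed as part of the string before the comment alternative can see it.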