Rcjp's Weblog

December 22, 2006

String Searching/Replacing

Filed under: c, lisp, python — rcjp @ 9:04 am

Just some quick notes on how various languages handle replacing/splicing a string…

In C++, chopping, searching and replacing in a string is fairly easy:

    #include <iostream>
    #include <string>

    int main()
    {
        std::string s = "some one with more than one that ones.";

        // replace the first "one" (3 chars long) with "three"
        s.replace(s.find("one"), 3, "three");

        std::cout << s << std::endl;
    }

In straight C, doing the same thing is a bit more fiddly:

    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        const char *s = "some one with more than one that ones.";

        char buf[255];
        strcpy(buf, s+3);  /* chop off the first 3 chars */

        char *p = strstr(buf, "one");
        if (p) {
            char tmp[255];
            *p = (char) 0;     /* terminate buf just before the match */
            strcpy(tmp, buf);
            strcat(tmp, "three");
            strcat(tmp, p+3);  /* skip over length of "one" */
            strcpy(buf, tmp);
        }

        printf("%s\n", buf);
        return 0;
    }

You have to think about the size of the temporary buffer unless you malloc something based on the size of s, but then you have to know how much bigger your operations will make the string.
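To put a number on "how much bigger": replacing every occurrence changes the length by the number of matches times the difference between the new and old word lengths, and a C buffer needs one more byte for the terminating NUL. Here is a rough sketch of that arithmetic, written in Python just for brevity. The helper name `buffer_size_needed` is made up for illustration, and it assumes plain one-byte-per-character text.

```python
def buffer_size_needed(s: str, old: str, new: str) -> int:
    """Bytes a C char buffer would need to hold s after replacing every
    non-overlapping occurrence of old with new, including the trailing NUL.
    Assumes one byte per character (plain ASCII)."""
    if not old:
        raise ValueError("old must be non-empty")
    count = s.count(old)  # number of non-overlapping matches
    return len(s) + count * (len(new) - len(old)) + 1


s = "some one with more than one that ones."
print(buffer_size_needed(s, "one", "three"))
```

In C you would get the count by calling strstr in a loop before allocating.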

Perhaps surprisingly, there isn't any standard function to replace a substring in Common Lisp. I guess it's one of those things you are supposed to deal with yourself: if you know your replacement word is the same size, you can destructively alter the string (using setf on the subseq); otherwise you have to build a new string (since you may be replacing a word with a bigger word). So CL leaves it to you to figure out the best approach.

You can get the position of a string within another with

    * (search "one" "there one is one more than ones")

and then build up the string a la C…

    (let* ((str "some one with more than one that ones.")
           (word "one")
           (p (search word str)))
      (if p
          (concatenate 'string (subseq str 3 p)
                               "three"
                               (subseq str (+ p (length word))))))

In Python:

    In [3]: "there one is one more than ones".replace("one", "three", 1)
    Out[3]: 'there three is one more than ones'

Here we pass a count of 1; otherwise it would replace all instances. We can even chop off the first three chars and then replace all in one go:

    In [5]: "there one is one more than ones"[3:].replace("one", "three")
    Out[5]: 're three is three more than threes'
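For comparison with the C and Lisp versions, the same find-and-splice approach can be written by hand in Python. This is only a sketch to show what is going on; `replace_first` is a name made up here, and in practice you would just call `str.replace` with a count of 1.

```python
def replace_first(s: str, old: str, new: str) -> str:
    """Replace the first occurrence of old in s by slicing around it."""
    p = s.find(old)  # index of the first match, or -1 if there is none
    if p == -1:
        return s     # nothing to replace
    # text before the match + replacement + text after the match
    return s[:p] + new + s[p + len(old):]


print(replace_first("there one is one more than ones", "one", "three"))
# there three is one more than ones
```

Since Python strings are immutable, a new string is always built, so there is no buffer size to worry about.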

Replacing all occurrences in C++ is a bit more work:

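    std::string::size_type n = 0;  // or   size_t n = 0;
    std::string word = "one";
    std::string repl = "three";
    while ((n = s.find(word, n)) != std::string::npos) {
        s.replace(n, word.size(), repl);
        n += repl.size();  // resume after the replacement so it is never rescanned
    }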

Replacing all occurrences in C or Common Lisp is quite a lot more work, and it is probably better to write a utility function and keep it somewhere, or use a library.
http://en.wikibooks.org/wiki/Programming:Common_Lisp/Strings has

    (defun replace-all (string part replacement &key (test #'char=))
      (with-output-to-string (out)
        (loop with part-length = (length part)
           for old-pos = 0 then (+ pos part-length)
           for pos = (search part string :start2 old-pos :test test)
           do
             (write-string string out :start old-pos
                                      :end (or pos (length string)))
           when pos do
             (write-string replacement out)
           while pos)))

or maybe use cl-ppcre
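If the loop in replace-all is hard to follow, here is a rough Python transcription of the same idea: find the next match, copy the untouched stretch before it, emit the replacement, and carry on from just past the match. It is a sketch only. The `:test` keyword is left out, and the guard against an empty search string is an addition of mine.

```python
def replace_all(string: str, part: str, replacement: str) -> str:
    """Copy string, swapping every non-overlapping part for replacement."""
    if not part:
        raise ValueError("part must be non-empty")  # would never advance otherwise
    out = []
    old_pos = 0
    while True:
        pos = string.find(part, old_pos)  # like (search part string :start2 old-pos)
        end = pos if pos != -1 else len(string)
        out.append(string[old_pos:end])   # the untouched stretch before the match
        if pos == -1:
            break                         # no more matches: the tail is copied, done
        out.append(replacement)
        old_pos = pos + len(part)         # continue just past the match
    return "".join(out)


print(replace_all("there one is one more than ones", "one", "three"))
# there three is three more than threes
```

Because the search always resumes after the match, a replacement that itself contains the search word does not cause an endless loop.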

