Last night a friend asked me if I'd tried using LLMs for programming, and I was pretty dismissive of the idea, but I hadn't really tried.
So when I got home, I figured I'd see how it would do on a task I did recently, one that was boring and repetitive but required enough thought to not be easily automated:
adding localization support to a file. I used the ReaNINJAM winclient.cpp (a newer version of that file, anyway), which is pretty much open source anyway (I sure as shit don't want to feed OpenAI any of our private code).
I tried ChatGPT, and at first go it didn't go well (I don't think it had enough context to hold the 2,200-line file, and it produced very limited results), so I opted to sign up for a subscription (ugh, really hate to do this, set a Slack reminder to cancel it in 25 days...) so I could use a larger model (4.5-preview), and that was a little better.
Prompt:
In the attached C++ code, I want to add localization support using the __LOCALIZE() macro.
For any C string that is likely to be displayed to the user, e.g. "Hello World", replace
it with the same string wrapped in the __LOCALIZE() macro, e.g. in this case
__LOCALIZE("Hello World","section"). If the string is not likely to be displayed to the
user (e.g. passed to WritePrivateProfileString or some other API), then do not modify it.
If you are unsure about whether a string should be localized, add a comment in the form
of "/* maybe localize? */" near that string.
If the string is passed to a function like snprintf(), then use __LOCALIZE_VERFMT instead
of __LOCALIZE.
Be aware that the __LOCALIZE() function must be called after the program started, so any
static or global data which is initialized with a string should not be replaced with a
__LOCALIZE() macro. In this instance, all places that reference that global string
(e.g. const char *globalstring = "some string that needs to be localized"; at the global
scope) should be replaced with __localizeFunc(globalstring,"section",0) instead. This also
applies if the string is passed to the constructor of a global object.
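For reference, the rules in that prompt boil down to something like the sketch below. The stub definitions of `__localizeFunc`, `__LOCALIZE` and `__LOCALIZE_VERFMT` are stand-ins so the example compiles by itself (the real ones come from the project's localization header, which isn't shown here), and `g_status_text`, `build_status()` and `config_key()` are made-up names for illustration only.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// ASSUMPTION: stand-ins for the real localization API. These stubs return
// the string untranslated so the example builds without the real header.
static const char *__localizeFunc(const char *str, const char *section, int flags)
{
  assert(str != nullptr);
  (void)section; (void)flags;
  return str;
}
#define __LOCALIZE(str, section)        __localizeFunc(str, section, 0)
#define __LOCALIZE_VERFMT(str, section) __localizeFunc(str, section, 0)

// Global data is initialized before the program has started up, so the
// literal is left alone here and translated at the point of use instead.
static const char *g_status_text = "Not connected";

// Hypothetical function exercising each rule from the prompt.
std::string build_status(int num_users)
{
  char buf[256];
  // Format string passed to snprintf(): use __LOCALIZE_VERFMT.
  snprintf(buf, sizeof(buf), __LOCALIZE_VERFMT("%d users", "section"), num_users);

  // Reference to a global string: use __localizeFunc() at the use site.
  std::string out = __localizeFunc(g_status_text, "section", 0);
  out += ": ";
  out += buf;
  out += " ";
  // Plain user-facing literal: wrap in __LOCALIZE.
  out += __LOCALIZE("Hello World", "section");
  return out;
}

// Not user-facing (an ini key name), so it stays unwrapped.
const char *config_key() { return "config_appear"; }
```

With the pass-through stubs, `build_status(3)` just returns the English text; with a real implementation the three wrapped strings would be looked up at runtime, while the ini key is never touched.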
Response:
Here's the fully localized version of the provided winclient.cpp, modified according to your
localization specifications. Strings intended for display to users are wrapped in
__LOCALIZE() or __LOCALIZE_VERFMT() macros, and appropriate handling for global/static data
strings is implemented as per your instructions.
Due to the size and complexity of the provided code, here is a summarized list of the changes
applied:
User-facing strings (e.g., dialog titles, button texts, and status messages) are wrapped
with the macro __LOCALIZE("<original text>", "section").
Format strings passed to functions like snprintf() are wrapped with
__LOCALIZE_VERFMT("<format string>", "section").
Global/static data initialization remains untouched, with references to those strings
replaced using __localizeFunc(globalstring, "section", 0).
Here's an illustrative snippet showcasing how these modifications appear:
It goes on to give perfect snippets of the example changes. Neato! Then I ask it to produce the whole file, it chugs for maybe 30-60s, and provides a file. I go diff it against the original. Lol. Not even close. Here are some examples:
Clearly it doesn't understand the meaning of "unsure", or thinks header filenames might be localized?:
-#include "resource.h"
+#include "resource.h"/* maybe localize? */
I mean maybe an empty string could be localized, I should've told it that empty strings
should be left alone. But anyway that "maybe" thing was a mistake, I'll stop reporting those
failings, which are... extensive:
- GetPrivateProfileString(CONFSEC,"sessiondir","",str,sizeof(str),...
+ GetPrivateProfileString(CONFSEC,"sessiondir",""/* maybe localize? */,str,...
It should be able to guess that this is going to end up in the UI. Perhaps my criteria
should have been "if it looks like english words with spaces, then definitely localize..."
- bi.lpszTitle = "Select a directory:";
+ bi.lpszTitle = "Select a directory:"/* maybe localize? */;
Here it just starts putting maybe localize inside the strings (including the section), for
some reason, lol.
- snprintf(buf, sizeof(buf), "%d", g_config_appear);
- WritePrivateProfileString(CONFSEC, "config_appear", buf, g_ini_file.Get());
+ snprintf(buf, sizeof(buf), __LOCALIZE_VERFMT("%d","/* maybe localize? */section"), ...
+ WritePrivateProfileString(CONFSEC, "/* maybe localize? */config_appear", buf, ...
It actually did that last one in a ton of places.
The full diff is really long, and useless, so I won't bother posting it. I'm sure I could keep going back and forth and get it better, if not entirely perfect, but the whole point is that I didn't want this to be a long tedious process! Getting my vi-fu on and doing it manually seems like a breeze (and it required less brain power, that's for sure).
I tried again, using o4-mini-high: it responded with some suggested sed commands, which looked plausible but weren't helpful enough to be of any real benefit.
So far, useless.
Perhaps I should suggest it write me a tool which reprocesses a .cpp file, optionally replacing strings with the wrapped __LOCALIZE string, allowing me to decide.
Prompt:
ok, write me a tool in C++, which runs from the command line on macOS, and reads a C++ source file,
prompting me y/n whether to wrap each C-style string with the __LOCALIZE() macro, and writing an
output file with the replaced contents
Response:
(thinks for 21 seconds)
Here’s a simple interactive C++17 tool that does exactly that. It reads an input .cpp file, walks
through every C-style string literal, shows you the line it’s on, and asks “Wrap this one?”.
If you hit y, it replaces it with __LOCALIZE("…","section") otherwise it leaves it untouched.
Finally it writes out to the path you specified.
llm_localize_tool.cpp
Hey, it compiles! And it pretty much works, too. Amusingly it remembered from the previous prompt that __LOCALIZE_VERFMT was a valid wrapping too. Interesting. Skimming the code, most of it makes sense, though I noticed this bit:
if (litPos >= 11 && content.substr(litPos-11,11) == "__LOCALIZE(")
already = true;
if (litPos >= 21 && content.substr(litPos-21,18) == "__LOCALIZE_VERFMT(")
already = true;
Umm, where'd that 21 come from? Anyway, something to keep in mind for those times when you need some throwaway code written, I guess?
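One way to avoid that kind of hand-counted offset is to derive both the offset and the compared length from the prefix itself. The following is a minimal sketch, not the tool's actual code: `preceded_by()` and `already_wrapped()` are hypothetical helper names, and `litPos` is assumed to be the index of the literal's opening quote, as in the snippet above.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Returns true if the text ending just before `pos` is exactly `prefix`.
// The offset and the compared length both come from strlen(prefix), so
// they cannot drift apart the way a hand-counted 21 and 18 can.
static bool preceded_by(const std::string &content, size_t pos, const char *prefix)
{
  assert(pos <= content.size());
  const size_t n = strlen(prefix);
  return pos >= n && content.compare(pos - n, n, prefix) == 0;
}

// litPos: index of the string literal's opening quote (assumed meaning).
static bool already_wrapped(const std::string &content, size_t litPos)
{
  return preceded_by(content, litPos, "__LOCALIZE(") ||
         preceded_by(content, litPos, "__LOCALIZE_VERFMT(");
}
```

Note that this only recognizes a literal sitting directly after the opening paren, i.e. the first argument of the macro. It also assumes no whitespace between the paren and the quote.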
Prompt:
update that tool to disable line-buffered input, and allow me to just hit "y" without
having to hit enter
Response:
The user wants to modify the C++ code to disable line-buffered input, so they can just press
"y" without having to hit enter. Right now, the code uses std::getline(), which requires Hit
Enter. To achieve enabling immediate key presses, we’ll need to adjust the code for
macOS/Linux, using termios to disable buffering and switch to raw mode. This will allow reading
a single character, and we'll restore the terminal settings afterwards. Time to update the code!
llm_localize_tool2.cpp
That one doesn't compile. Interestingly, it fails around the place where the previous error was,
and it fixed the previous error without me asking:
bool already = false;
if (litPos >= 11 && content.substr(litPos - 11, 11) == "__LOCALIZE(")
|| (litPos >= 18 && content.substr(litPos - 18, 18) == "__LOCALIZE_VERFMT(")) {
already = true;
}
(easy enough to fix, one missing paren at the start of the if()). But once I tweak that, it seems to work! Mostly. It still matches the second parameter to existing __LOCALIZE() calls (nontrivial bug). And it's missing an output flush for the prompt text (trivial bug). I definitely would have had to look up how to implement its getch() function, lol. Anyway...
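For anyone else who'd have to look it up, a single-keypress read with termios looks roughly like the sketch below. This is an independent illustration, not the code from llm_localize_tool2.cpp: `read_key()` and `ask_yes_no()` are hypothetical names, and the fallback for non-terminal input (e.g. a pipe) is an assumption added so the function behaves sensibly when stdin is redirected. It also includes the `fflush()` that the generated prompt was missing.

```cpp
#include <cstdio>
#include <termios.h>
#include <unistd.h>

// Reads one byte from fd without waiting for Enter. Returns the byte, or -1
// on EOF/error. If fd is not a terminal (e.g. a pipe), it is read as-is.
static int read_key(int fd)
{
  struct termios saved;
  const bool is_tty = isatty(fd) && tcgetattr(fd, &saved) == 0;
  if (is_tty)
  {
    struct termios raw = saved;
    raw.c_lflag &= ~(ICANON | ECHO); // no line buffering, no echo
    raw.c_cc[VMIN] = 1;              // read() returns after one byte
    raw.c_cc[VTIME] = 0;             // no timeout
    tcsetattr(fd, TCSANOW, &raw);
  }

  unsigned char c = 0;
  const ssize_t n = read(fd, &c, 1);

  if (is_tty) tcsetattr(fd, TCSANOW, &saved); // always restore the terminal
  return n == 1 ? c : -1;
}

// Prints the prompt, flushes it so it is visible before blocking, then
// returns true only for 'y' or 'Y'.
static bool ask_yes_no(const char *prompt, int fd = STDIN_FILENO)
{
  fputs(prompt, stdout);
  fflush(stdout); // without this the prompt may sit in the stdio buffer
  const int c = read_key(fd);
  fputc('\n', stdout);
  return c == 'y' || c == 'Y';
}
```

Since `read()` bypasses the stdio input buffer, this shouldn't be mixed with `std::getline()` or `fgets()` on the same descriptor.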
Sorry to waste all of that electricity and cooling water, and to poison (pollute?) the internet with more LLM text/code output. :/