How to get Java to match JavaScript encodeURIComponent() method? - Stack Overflow

I am trying to pass strings containing special characters in a URL, and the only way I can get it to work is with JavaScript's encodeURIComponent('testeræøå'), which produces "tester%C3%A6%C3%B8%C3%A5".

Everything I try in Java produces a different encoding that does not work on the other end. Any idea how I can get testeræøå encoded as tester%C3%A6%C3%B8%C3%A5 in Java? Thanks in advance!

package com.mastercard.cp.sdng.domain.user;

import org.apache.commons.lang.StringUtils;

import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLEncoder;

public class UrlEncodingSample
{
    public static void main(String[] args)
    {
        String userId = "dummy";
        try
        {
            validateEncoding(userId);

            userId = "testeræøå";

            validateEncoding(userId);

            userId = URLEncoder.encode(userId);

            validateEncoding(userId);
        }
        catch (UnsupportedEncodingException e)
        {
            e.printStackTrace();  //To change body of catch statement use File | Settings | File Templates.
        }

    }

    private static void validateEncoding(String userId) throws UnsupportedEncodingException
    {
        System.out.println("------ START TESTING WITH USER ID = '"+userId+"' ----------------------");
        System.out.println("Test URLEncoder.encode(userId): " + URLEncoder.encode(userId));
        System.out.println("Test URLEncoder.encode(userId,\"UTF-8\"): " + URLEncoder.encode(userId, "UTF-8"));
        System.out.println("Test URLEncoder.encode(userId,\"UTF-16\"): " + URLEncoder.encode(userId,"UTF-16"));
        System.out.println("Test URLEncoder.encode(userId,\"UTF-16LE\"): " + URLEncoder.encode(userId,"UTF-16LE"));
        System.out.println("Test URLEncoder.encode(userId,\"UTF-16BE\"): " + URLEncoder.encode(userId,"UTF-16BE"));

        ScriptEngine engine = new ScriptEngineManager().getEngineByName("JavaScript");
        try
        {
            System.out.println("Test engine.eval(\"encodeURIComponent(\\\"\"+userId+\"\\\")\"): " +
                    engine.eval("encodeURIComponent(\""+userId+"\")"));
        }
        catch (ScriptException e)
        {
            e.printStackTrace();  //To change body of catch statement use File | Settings | File Templates.
        }
        System.out.println("Test encodeURIComponent(userId): " + encodeURIComponent(userId));
        try
        {
            System.out.println("TEST new URI(userId).toASCIIString(): " + new URI(userId).toASCIIString());
        }
        catch (URISyntaxException e)
        {
            e.printStackTrace();  //To change body of catch statement use File | Settings | File Templates.
        }
        System.out.println("------ END TESTING WITH USER ID = '"+userId+"' ----------------------\n\n");

    }



    public static String encodeURIComponent(String input) {
        if(StringUtils.isEmpty(input)) {
            return input;
        }

        int l = input.length();
        StringBuilder o = new StringBuilder(l * 3);
        try {
            for (int i = 0; i < l; i++) {
                String e = input.substring(i, i + 1);
                if (ALLOWED_CHARS.indexOf(e) == -1) {
                    byte[] b = e.getBytes("utf-8");
                    o.append(getHex(b));
                    continue;
                }
                o.append(e);
            }
            return o.toString();
        } catch(UnsupportedEncodingException e) {
            e.printStackTrace();
        }
        return input;
    }

    private static String getHex(byte buf[]) {
        StringBuilder o = new StringBuilder(buf.length * 3);
        for (int i = 0; i < buf.length; i++) {
            int n = (int) buf[i] & 0xff;
            o.append("%");
            if (n < 0x10) {
                o.append("0");
            }
            o.append(Long.toString(n, 16).toUpperCase());
        }
        return o.toString();
    }

    public static final String ALLOWED_CHARS = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-_.!~*'()";
}

Output of above class is this:


    ------ START TESTING WITH USER ID = 'dummy' ----------------------
    Test URLEncoder.encode(userId): dummy
    Test URLEncoder.encode(userId,"UTF-8"): dummy
    Test URLEncoder.encode(userId,"UTF-16"): dummy
    Test URLEncoder.encode(userId,"UTF-16LE"): dummy
    Test URLEncoder.encode(userId,"UTF-16BE"): dummy
    Test engine.eval("encodeURIComponent(\""+userId+"\")"): dummy
    Test encodeURIComponent(userId): dummy
    TEST new URI(userId).toASCIIString(): dummy
    ------ END TESTING WITH USER ID = 'dummy' ----------------------


    ------ START TESTING WITH USER ID = 'testerๆ๘ๅ' ----------------------
    Test URLEncoder.encode(userId): tester%E6%F8%E5
    Test URLEncoder.encode(userId,"UTF-8"): tester%E0%B9%86%E0%B9%98%E0%B9%85
    Test URLEncoder.encode(userId,"UTF-16"): tester%FE%FF%0E%46%0E%58%0E%45
    Test URLEncoder.encode(userId,"UTF-16LE"): tester%46%0E%58%0E%45%0E
    Test URLEncoder.encode(userId,"UTF-16BE"): tester%0E%46%0E%58%0E%45
    Test engine.eval("encodeURIComponent(\""+userId+"\")"): tester%e0%b9%86%e0%b9%98%e0%b9%85
    Test encodeURIComponent(userId): tester%E0%B9%86%E0%B9%98%E0%B9%85
    TEST new URI(userId).toASCIIString(): tester%E0%B9%86%E0%B9%98%E0%B9%85
    ------ END TESTING WITH USER ID = 'testerๆ๘ๅ' ----------------------


    ------ START TESTING WITH USER ID = 'tester%E6%F8%E5' ----------------------
    Test URLEncoder.encode(userId): tester%25E6%25F8%25E5
    Test URLEncoder.encode(userId,"UTF-8"): tester%25E6%25F8%25E5
    Test URLEncoder.encode(userId,"UTF-16"): tester%FE%FF%00%25E6%FE%FF%00%25F8%FE%FF%00%25E5
    Test URLEncoder.encode(userId,"UTF-16LE"): tester%25%00E6%25%00F8%25%00E5
    Test URLEncoder.encode(userId,"UTF-16BE"): tester%00%25E6%00%25F8%00%25E5
    Test engine.eval("encodeURIComponent(\""+userId+"\")"): tester%25E6%25F8%25E5
    Test encodeURIComponent(userId): tester%25E6%25F8%25E5
    TEST new URI(userId).toASCIIString(): tester%E6%F8%E5
    ------ END TESTING WITH USER ID = 'tester%E6%F8%E5' ----------------------

Note: As I was writing this up, it occurred to me that I could use the URLEncoder.encode(userId, "UTF-8") as long as I used the proper decoder on the other side... but I was still trying to find a way to encode it to match the JavaScript encodeURIComponent function which apparently works without the need to decode it on the other side. :)
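
A minimal sketch of the round trip mentioned in the note, assuming the receiving side is also Java and decodes with java.net.URLDecoder using UTF-8. The class name RoundTripSketch and its two helper methods are made up for illustration. Unicode escapes are used for æøå so the result does not depend on the encoding of the source file.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical helper class, not part of the original question.
final class RoundTripSketch {

    /** Percent-encodes the text as UTF-8 (form encoding, so a space becomes "+"). */
    static String encode(String text) {
        try {
            return URLEncoder.encode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    /** Reverses encode(); also accepts "%20" for a space. */
    static String decode(String encoded) {
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        String userId = "tester\u00e6\u00f8\u00e5"; // testeræøå
        String encoded = encode(userId);
        System.out.println(encoded); // tester%C3%A6%C3%B8%C3%A5
        System.out.println(decode(encoded).equals(userId)); // true
    }
}
```

Because URLDecoder treats both "+" and "%20" as a space, the same decode call should also accept a value produced by encodeURIComponent, provided both sides agree on UTF-8.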

asked Aug 13, 2014 at 23:57 by mattssmith

2 Answers

According to the Mozilla developer docs, encodeURIComponent() encodes using UTF-8. When I run it on your string I get tester%C3%A6%C3%B8%C3%A5, as expected. When I run the following Java code:

System.out.println(URLEncoder.encode("testeræøå", "UTF-8"));

It also prints tester%C3%A6%C3%B8%C3%A5. I also ran your test and got:

    ------ START TESTING WITH USER ID = 'dummy' ----------------------
    Test URLEncoder.encode(userId): dummy
    Test URLEncoder.encode(userId,"UTF-8"): dummy
    Test URLEncoder.encode(userId,"UTF-16"): dummy
    Test URLEncoder.encode(userId,"UTF-16LE"): dummy
    Test URLEncoder.encode(userId,"UTF-16BE"): dummy
    Test engine.eval("encodeURIComponent(\""+userId+"\")"): dummy
    Test encodeURIComponent(userId): dummy
    TEST new URI(userId).toASCIIString(): dummy
    ------ END TESTING WITH USER ID = 'dummy' ----------------------


    ------ START TESTING WITH USER ID = 'testeræøå' ----------------------
    Test URLEncoder.encode(userId): tester%C3%A6%C3%B8%C3%A5
    Test URLEncoder.encode(userId,"UTF-8"): tester%C3%A6%C3%B8%C3%A5
    Test URLEncoder.encode(userId,"UTF-16"): tester%FE%FF%00%E6%00%F8%00%E5
    Test URLEncoder.encode(userId,"UTF-16LE"): tester%E6%00%F8%00%E5%00
    Test URLEncoder.encode(userId,"UTF-16BE"): tester%00%E6%00%F8%00%E5
    Test engine.eval("encodeURIComponent(\""+userId+"\")"): tester%C3%A6%C3%B8%C3%A5
    Test encodeURIComponent(userId): tester%C3%A6%C3%B8%C3%A5
    TEST new URI(userId).toASCIIString(): tester%C3%A6%C3%B8%C3%A5
    ------ END TESTING WITH USER ID = 'testeræøå' ----------------------


    ------ START TESTING WITH USER ID = 'tester%C3%A6%C3%B8%C3%A5' ----------------------
    Test URLEncoder.encode(userId): tester%25C3%25A6%25C3%25B8%25C3%25A5
    Test URLEncoder.encode(userId,"UTF-8"): tester%25C3%25A6%25C3%25B8%25C3%25A5
    Test URLEncoder.encode(userId,"UTF-16"): tester%FE%FF%00%25C3%FE%FF%00%25A6%FE%FF%00%25C3%FE%FF%00%25B8%FE%FF%00%25C3%FE%FF%00%25A5
    Test URLEncoder.encode(userId,"UTF-16LE"): tester%25%00C3%25%00A6%25%00C3%25%00B8%25%00C3%25%00A5
    Test URLEncoder.encode(userId,"UTF-16BE"): tester%00%25C3%00%25A6%00%25C3%00%25B8%00%25C3%00%25A5
    Test engine.eval("encodeURIComponent(\""+userId+"\")"): tester%25C3%25A6%25C3%25B8%25C3%25A5
    Test encodeURIComponent(userId): tester%25C3%25A6%25C3%25B8%25C3%25A5
    TEST new URI(userId).toASCIIString(): tester%C3%A6%C3%B8%C3%A5
    ------ END TESTING WITH USER ID = 'tester%C3%A6%C3%B8%C3%A5' ----------------------

This is what I would expect.

I think you need to check the file encoding for your Java source file. If you are using Eclipse it defaults to cp1252 for some reason. The first thing I do when I install Eclipse is change the default encoding to UTF-8.
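
To illustrate how a source-file encoding mismatch could produce the Thai characters in the question's output, here is a rough sketch. It assumes the file was saved in a single-byte Western encoding (ISO-8859-1 here; æøå have the same byte values in cp1252) and then read with a Thai charset such as TIS-620. The Thai charset is a guess inferred from the question's output, not something the asker confirmed, and the class and method names are made up for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original answer.
final class SourceEncodingMismatchSketch {

    /** Re-decodes the ISO-8859-1 bytes of the text with another charset, as a misconfigured compiler would. */
    static String misread(String text, Charset wrongCharset) {
        byte[] bytesOnDisk = text.getBytes(StandardCharsets.ISO_8859_1); // E6 F8 E5 for æøå
        return new String(bytesOnDisk, wrongCharset);
    }

    /** Percent-encodes the text as UTF-8. */
    static String utf8Encode(String text) {
        try {
            return URLEncoder.encode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        // Unicode escapes keep this literal independent of the file encoding.
        String intended = "tester\u00e6\u00f8\u00e5";
        System.out.println(utf8Encode(intended)); // tester%C3%A6%C3%B8%C3%A5

        // Assumption: the runtime ships the TIS-620 charset; skip the demo otherwise.
        if (Charset.isSupported("TIS-620")) {
            String seen = misread(intended, Charset.forName("TIS-620"));
            System.out.println(utf8Encode(seen)); // tester%E0%B9%86%E0%B9%98%E0%B9%85
        }
    }
}
```

The second value matches the URLEncoder.encode(userId, "UTF-8") line in the question's output. That is consistent with the string literal already being wrong before any URL encoding happened. Writing the literal with \u escapes, or compiling with the correct source encoding, avoids the problem.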

For others who stumble upon this question and notice that a space is encoded as + in Java but as %20 in JavaScript:

One possible solution is to use org.apache.commons.httpclient.util.URIUtil#encodeQuery.

If you're using the latest httpclient 4, then URIParserUtil#escapeChars can be used instead.

Sample code:

URIUtil.encodeQuery(strQuery); // httpclient 3.x
URIParserUtil.escapeChars(strQuery); // httpclient 4.x
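
If adding a library is not an option, the space difference (and a few other characters) can probably be handled with the JDK alone. The sketch below is not from either answer. It relies on the fact that URLEncoder escapes ! ' ( ) ~ and writes a space as +, while encodeURIComponent leaves those five characters alone and writes a space as %20. The class name is made up for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper class; approximates JavaScript's encodeURIComponent().
final class EncodeUriComponentSketch {

    static String encodeURIComponent(String input) {
        if (input == null || input.isEmpty()) {
            return input;
        }
        try {
            // After URLEncoder runs, every '%' starts a real escape, so these
            // literal replacements cannot hit the middle of another sequence.
            return URLEncoder.encode(input, "UTF-8")
                    .replace("+", "%20")  // space: form style -> URI style
                    .replace("%21", "!")
                    .replace("%27", "'")
                    .replace("%28", "(")
                    .replace("%29", ")")
                    .replace("%7E", "~");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(encodeURIComponent("tester\u00e6\u00f8\u00e5")); // tester%C3%A6%C3%B8%C3%A5
        System.out.println(encodeURIComponent("a b+c"));                    // a%20b%2Bc
    }
}
```

One known difference remains: for a string containing an unpaired surrogate, encodeURIComponent throws a URIError, whereas URLEncoder does not throw, so this should be treated as an approximation rather than an exact port.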
