commons_io_npe_7

준혁 · July 1, 2024

The function that actually raises the error is shown below.

        public Builder setCharsetEncoder(final CharsetEncoder charsetEncoder) {
            this.charsetEncoder = charsetEncoder; // line 139
            super.setCharset(charsetEncoder.charset()); // line 140: NullPointerException when charsetEncoder is null
            return this;
        }
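
To make the failure at line 140 concrete, here is a minimal reproduction sketch. It does not use commons-io: `FaultyBuilderSketch` is a hypothetical stand-in that models only the two fields the setter touches, and `nullEncoderThrows` is a helper invented for this note. The setter body mirrors the two statements above.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

// Hypothetical stand-in for ReaderInputStream.Builder (not the real class).
class FaultyBuilderSketch {
    private Charset charset = Charset.defaultCharset();
    private CharsetEncoder charsetEncoder = charset.newEncoder();

    FaultyBuilderSketch setCharsetEncoder(final CharsetEncoder charsetEncoder) {
        // Mirrors line 139: a null argument is stored without any check.
        this.charsetEncoder = charsetEncoder;
        // Mirrors line 140: calling charset() on a null argument throws NullPointerException.
        this.charset = charsetEncoder.charset();
        return this;
    }

    CharsetEncoder getCharsetEncoder() {
        return charsetEncoder;
    }

    // Returns true when passing null reproduces the NullPointerException.
    static boolean nullEncoderThrows() {
        try {
            new FaultyBuilderSketch().setCharsetEncoder(null);
            return false;
        } catch (final NullPointerException expected) {
            return true;
        }
    }
}
```

Note that in this sketch the field has already been overwritten with null by the time the exception is thrown, so the builder is left in an inconsistent state even if the caller catches the exception.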

Provide a short code description of the following code:

/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements.  See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License.  You may obtain a copy of the License at
 *
 *      http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
package org.apache.commons.io.input;

import static org.apache.commons.io.IOUtils.EOF;

import java.io.IOException;
import java.io.InputStream;
import java.io.Reader;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.util.Objects;

import org.apache.commons.io.Charsets;
import org.apache.commons.io.IOUtils;
import org.apache.commons.io.build.AbstractOrigin;
import org.apache.commons.io.build.AbstractStreamBuilder;
import org.apache.commons.io.charset.CharsetEncoders;

/**
 * {@link InputStream} implementation that reads a character stream from a {@link Reader} and transforms it to a byte stream using a specified charset encoding.
 * The stream is transformed using a {@link CharsetEncoder} object, guaranteeing that all charset encodings supported by the JRE are handled correctly. In
 * particular for charsets such as UTF-16, the implementation ensures that one and only one byte order marker is produced.
 * <p>
 * Since in general it is not possible to predict the number of characters to be read from the {@link Reader} to satisfy a read request on the
 * {@link ReaderInputStream}, all reads from the {@link Reader} are buffered. There is therefore no well defined correlation between the current position of the
 * {@link Reader} and that of the {@link ReaderInputStream}. This also implies that in general there is no need to wrap the underlying {@link Reader} in a
 * {@link java.io.BufferedReader}.
 * </p>
 * <p>
 * {@link ReaderInputStream} implements the inverse transformation of {@link java.io.InputStreamReader}; in the following example, reading from {@code in2}
 * would return the same byte sequence as reading from {@code inputStream} (provided that the initial byte sequence is legal with respect to the charset encoding):
 * </p>
 * <p>
 * To build an instance, see {@link Builder}.
 * </p>
 * <pre>
 * InputStream inputStream = ...
 * Charset cs = ...
 * InputStreamReader reader = new InputStreamReader(inputStream, cs);
 * ReaderInputStream in2 = ReaderInputStream.builder()
 *   .setReader(reader)
 *   .setCharset(cs)
 *   .get();
 * </pre>
 * <p>
 * {@link ReaderInputStream} implements the same transformation as {@link java.io.OutputStreamWriter}, except that the control flow is reversed: both classes
 * transform a character stream into a byte stream, but {@link java.io.OutputStreamWriter} pushes data to the underlying stream, while {@link ReaderInputStream}
 * pulls it from the underlying stream.
 * </p>
 * <p>
 * Note that while there are use cases where there is no alternative to using this class, very often the need to use this class is an indication of a flaw in
 * the design of the code. This class is typically used in situations where an existing API only accepts an {@link InputStream}, but where the most natural way
 * to produce the data is as a character stream, i.e. by providing a {@link Reader} instance. An example of a situation where this problem may appear is when
 * implementing the {@code javax.activation.DataSource} interface from the Java Activation Framework.
 * </p>
 * <p>
 * The {@link #available()} method of this class always returns 0. The methods {@link #mark(int)} and {@link #reset()} are not supported.
 * </p>
 * <p>
 * Instances of {@link ReaderInputStream} are not thread safe.
 * </p>
 *
 * @see org.apache.commons.io.output.WriterOutputStream
 * @since 2.0
 */
public class ReaderInputStream extends InputStream {

    /**
     * Builds a new {@link ReaderInputStream} instance.
     * <p>
     * For example:
     * </p>
     * <pre>{@code
     * ReaderInputStream s = ReaderInputStream.builder()
     *   .setPath(path)
     *   .setCharsetEncoder(Charset.defaultCharset().newEncoder())
     *   .get();}
     * </pre>
     *
     * @since 2.12.0
     */
    public static class Builder extends AbstractStreamBuilder<ReaderInputStream, Builder> {

        private CharsetEncoder charsetEncoder = super.getCharset().newEncoder();

        /**
         * Constructs a new instance.
         * <p>
         * This builder uses the aspects Reader, Charset, CharsetEncoder, buffer size.
         * </p>
         * <p>
         * You must provide an origin that can be converted to a Reader by this builder, otherwise, this call will throw an
         * {@link UnsupportedOperationException}.
         * </p>
         *
         * @return a new instance.
         * @throws UnsupportedOperationException if the origin cannot provide a Reader.
         * @throws IllegalStateException if the {@code origin} is {@code null}.
         * @see AbstractOrigin#getReader(Charset)
         */
        @SuppressWarnings("resource")
        @Override
        public ReaderInputStream get() throws IOException {
            return new ReaderInputStream(checkOrigin().getReader(getCharset()), charsetEncoder, getBufferSize());
        }

        @Override
        public Builder setCharset(final Charset charset) {
            charsetEncoder = charset.newEncoder();
            return super.setCharset(charset);
        }

        /**
         * Sets the charset encoder.
         *
         * @param charsetEncoder the charset encoder.
         * @return this
         */
        public Builder setCharsetEncoder(final CharsetEncoder charsetEncoder) {
            this.charsetEncoder = charsetEncoder;
            super.setCharset(charsetEncoder.charset());
            return this;
        }

    }

    /**
     * Constructs a new {@link Builder}.
     *
     * @return a new {@link Builder}.
     * @since 2.12.0
     */
    public static Builder builder() {
        return new Builder();
    }

    static int checkMinBufferSize(final CharsetEncoder charsetEncoder, final int bufferSize) {
        final float minRequired = minBufferSize(charsetEncoder);
        if (bufferSize < minRequired) {
            throw new IllegalArgumentException(String.format("Buffer size %,d must be at least %s for a CharsetEncoder %s.", bufferSize, minRequired,
                    charsetEncoder.charset().displayName()));
        }
        return bufferSize;
    }

    static float minBufferSize(final CharsetEncoder charsetEncoder) {
        return charsetEncoder.maxBytesPerChar() * 2;
    }

    private final Reader reader;

    private final CharsetEncoder charsetEncoder;

    /**
     * CharBuffer used as input for the encoder. It should be reasonably large as we read data from the underlying Reader into this buffer.
     */
    private final CharBuffer encoderIn;
    /**
     * ByteBuffer used as output for the encoder. This buffer can be small as it is only used to transfer data from the encoder to the buffer provided by the
     * caller.
     */
    private final ByteBuffer encoderOut;

    private CoderResult lastCoderResult;

    private boolean endOfInput;

    /**
     * Constructs a new {@link ReaderInputStream} that uses the default character encoding with a default input buffer size of
     * {@value IOUtils#DEFAULT_BUFFER_SIZE} characters.
     *
     * @param reader the target {@link Reader}
     * @deprecated Use {@link ReaderInputStream#builder()} instead
     */
    @Deprecated
    public ReaderInputStream(final Reader reader) {
        this(reader, Charset.defaultCharset());
    }

    /**
     * Constructs a new {@link ReaderInputStream} with a default input buffer size of {@value IOUtils#DEFAULT_BUFFER_SIZE} characters.
     *
     * <p>
     * The encoder created for the specified charset will use {@link CodingErrorAction#REPLACE} for malformed input and unmappable characters.
     * </p>
     *
     * @param reader  the target {@link Reader}
     * @param charset the charset encoding
     * @deprecated Use {@link ReaderInputStream#builder()} instead, will be protected for subclasses.
     */
    @Deprecated
    public ReaderInputStream(final Reader reader, final Charset charset) {
        this(reader, charset, IOUtils.DEFAULT_BUFFER_SIZE);
    }

    /**
     * Constructs a new {@link ReaderInputStream}.
     *
     * <p>
     * The encoder created for the specified charset will use {@link CodingErrorAction#REPLACE} for malformed input and unmappable characters.
     * </p>
     *
     * @param reader     the target {@link Reader}.
     * @param charset    the charset encoding.
     * @param bufferSize the size of the input buffer in number of characters.
     * @deprecated Use {@link ReaderInputStream#builder()} instead
     */
    @Deprecated
    public ReaderInputStream(final Reader reader, final Charset charset, final int bufferSize) {
        // @formatter:off
        this(reader,
                Charsets.toCharset(charset).newEncoder()
                        .onMalformedInput(CodingErrorAction.REPLACE)
                        .onUnmappableCharacter(CodingErrorAction.REPLACE),
                bufferSize);
        // @formatter:on
    }

    /**
     * Constructs a new {@link ReaderInputStream}.
     *
     * <p>
     * This constructor does not call {@link CharsetEncoder#reset() reset} on the provided encoder. The caller of this constructor should do this when providing
     * an encoder which had already been in use.
     * </p>
     *
     * @param reader         the target {@link Reader}
     * @param charsetEncoder the charset encoder
     * @since 2.1
     * @deprecated Use {@link ReaderInputStream#builder()} instead
     */
    @Deprecated
    public ReaderInputStream(final Reader reader, final CharsetEncoder charsetEncoder) {
        this(reader, charsetEncoder, IOUtils.DEFAULT_BUFFER_SIZE);
    }

    /**
     * Constructs a new {@link ReaderInputStream}.
     *
     * <p>
     * This constructor does not call {@link CharsetEncoder#reset() reset} on the provided encoder. The caller of this constructor should do this when providing
     * an encoder which had already been in use.
     * </p>
     *
     * @param reader         the target {@link Reader}
     * @param charsetEncoder the charset encoder, null defaults to the default Charset encoder.
     * @param bufferSize     the size of the input buffer in number of characters
     * @since 2.1
     * @deprecated Use {@link ReaderInputStream#builder()} instead
     */
    @Deprecated
    public ReaderInputStream(final Reader reader, final CharsetEncoder charsetEncoder, final int bufferSize) {
        this.reader = reader;
        this.charsetEncoder = CharsetEncoders.toCharsetEncoder(charsetEncoder);
        this.encoderIn = CharBuffer.allocate(checkMinBufferSize(this.charsetEncoder, bufferSize));
        this.encoderIn.flip();
        this.encoderOut = ByteBuffer.allocate(128);
        this.encoderOut.flip();
    }

    /**
     * Constructs a new {@link ReaderInputStream} with a default input buffer size of {@value IOUtils#DEFAULT_BUFFER_SIZE} characters.
     *
     * <p>
     * The encoder created for the specified charset will use {@link CodingErrorAction#REPLACE} for malformed input and unmappable characters.
     * </p>
     *
     * @param reader      the target {@link Reader}
     * @param charsetName the name of the charset encoding
     * @deprecated Use {@link ReaderInputStream#builder()} instead
     */
    @Deprecated
    public ReaderInputStream(final Reader reader, final String charsetName) {
        this(reader, charsetName, IOUtils.DEFAULT_BUFFER_SIZE);
    }

    /**
     * Constructs a new {@link ReaderInputStream}.
     *
     * <p>
     * The encoder created for the specified charset will use {@link CodingErrorAction#REPLACE} for malformed input and unmappable characters.
     * </p>
     *
     * @param reader      the target {@link Reader}
     * @param charsetName the name of the charset encoding, null maps to the default Charset.
     * @param bufferSize  the size of the input buffer in number of characters
     * @deprecated Use {@link ReaderInputStream#builder()} instead
     */
    @Deprecated
    public ReaderInputStream(final Reader reader, final String charsetName, final int bufferSize) {
        this(reader, Charsets.toCharset(charsetName), bufferSize);
    }

    /**
     * Closes the stream. This method will cause the underlying {@link Reader} to be closed.
     *
     * @throws IOException if an I/O error occurs.
     */
    @Override
    public void close() throws IOException {
        reader.close();
    }

    /**
     * Fills the internal char buffer from the reader.
     *
     * @throws IOException If an I/O error occurs
     */
    private void fillBuffer() throws IOException {
        if (!endOfInput && (lastCoderResult == null || lastCoderResult.isUnderflow())) {
            encoderIn.compact();
            final int position = encoderIn.position();
            // We don't use Reader#read(CharBuffer) here because it is more efficient
            // to write directly to the underlying char array (the default implementation
            // copies data to a temporary char array).
            final int c = reader.read(encoderIn.array(), position, encoderIn.remaining());
            if (c == EOF) {
                endOfInput = true;
            } else {
                encoderIn.position(position + c);
            }
            encoderIn.flip();
        }
        encoderOut.compact();
        lastCoderResult = charsetEncoder.encode(encoderIn, encoderOut, endOfInput);
        if (endOfInput) {
            lastCoderResult = charsetEncoder.flush(encoderOut);
        }
        if (lastCoderResult.isError()) {
            lastCoderResult.throwException();
        }
        encoderOut.flip();
    }

    /**
     * Gets the CharsetEncoder.
     *
     * @return the CharsetEncoder.
     */
    CharsetEncoder getCharsetEncoder() {
        return charsetEncoder;
    }

    /**
     * Reads a single byte.
     *
     * @return either the byte read or {@code -1} if the end of the stream has been reached
     * @throws IOException if an I/O error occurs.
     */
    @Override
    public int read() throws IOException {
        for (;;) {
            if (encoderOut.hasRemaining()) {
                return encoderOut.get() & 0xFF;
            }
            fillBuffer();
            if (endOfInput && !encoderOut.hasRemaining()) {
                return EOF;
            }
        }
    }

    /**
     * Reads the specified number of bytes into an array.
     *
     * @param b the byte array to read into
     * @return the number of bytes read or {@code -1} if the end of the stream has been reached
     * @throws IOException if an I/O error occurs.
     */
    @Override
    public int read(final byte[] b) throws IOException {
        return read(b, 0, b.length);
    }

    /**
     * Reads the specified number of bytes into an array.
     *
     * @param array the byte array to read into
     * @param off   the offset to start reading bytes into
     * @param len   the number of bytes to read
     * @return the number of bytes read or {@code -1} if the end of the stream has been reached
     * @throws IOException if an I/O error occurs.
     */
    @Override
    public int read(final byte[] array, int off, int len) throws IOException {
        Objects.requireNonNull(array, "array");
        if (len < 0 || off < 0 || off + len > array.length) {
            throw new IndexOutOfBoundsException("Array size=" + array.length + ", offset=" + off + ", length=" + len);
        }
        int read = 0;
        if (len == 0) {
            return 0; // Always return 0 if len == 0
        }
        while (len > 0) {
            if (encoderOut.hasRemaining()) { // Data from the last read not fully copied
                final int c = Math.min(encoderOut.remaining(), len);
                encoderOut.get(array, off, c);
                off += c;
                len -= c;
                read += c;
            } else if (endOfInput) { // Already reached EOF in the last read
                break;
            } else { // Read again
                fillBuffer();
            }
        }
        return read == 0 && endOfInput ? EOF : read;
    }
}
ReaderInputStream.java:140

The provided code is expected to pass these test cases:

    @Test
    public void testResetCharsetEncoder() {
        assertNotNull(ReaderInputStream.builder().setReader(new StringReader("\uD800")).setCharsetEncoder(null).getCharsetEncoder());
    }
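
One way the test's expectation could be met is a null-safe setter. The sketch below is an assumption, not the actual upstream patch: `NullSafeBuilderSketch` is a hypothetical stand-in for the builder, and falling back to the default charset is a guess modeled on the constructor Javadoc ("null defaults to the default Charset encoder"). The test also calls `getCharsetEncoder()` on the builder, which the listing's `Builder` does not declare, so the stand-in adds that getter.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

// Hypothetical stand-in for ReaderInputStream.Builder (not the real class).
class NullSafeBuilderSketch {
    private Charset charset = Charset.defaultCharset();
    private CharsetEncoder charsetEncoder = charset.newEncoder();

    NullSafeBuilderSketch setCharsetEncoder(final CharsetEncoder newEncoder) {
        // Assumed fallback: use an encoder for the default charset when null is passed.
        this.charsetEncoder = newEncoder != null ? newEncoder : Charset.defaultCharset().newEncoder();
        // Read the charset from the field, which is never null at this point.
        this.charset = this.charsetEncoder.charset();
        return this;
    }

    // Getter assumed by the test; absent from the listing's Builder.
    CharsetEncoder getCharsetEncoder() {
        return charsetEncoder;
    }

    Charset getCharset() {
        return charset;
    }
}
```

With this shape, `new NullSafeBuilderSketch().setCharsetEncoder(null).getCharsetEncoder()` returns a non-null encoder, which is what `assertNotNull` checks in the test.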

Faulty Code:

/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements.  See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License.  You may obtain a copy of the License at
 *
 *      http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
package org.apache.commons.io.input;

import static org.apache.commons.io.IOUtils.EOF;

import java.io.IOException;
import java.io.InputStream;
import java.io.Reader;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.util.Objects;

import org.apache.commons.io.Charsets;
import org.apache.commons.io.IOUtils;
import org.apache.commons.io.build.AbstractOrigin;
import org.apache.commons.io.build.AbstractStreamBuilder;
import org.apache.commons.io.charset.CharsetEncoders;

/**
 * {@link InputStream} implementation that reads a character stream from a {@link Reader} and transforms it to a byte stream using a specified charset encoding.
 * The stream is transformed using a {@link CharsetEncoder} object, guaranteeing that all charset encodings supported by the JRE are handled correctly. In
 * particular for charsets such as UTF-16, the implementation ensures that one and only one byte order marker is produced.
 * <p>
 * Since in general it is not possible to predict the number of characters to be read from the {@link Reader} to satisfy a read request on the
 * {@link ReaderInputStream}, all reads from the {@link Reader} are buffered. There is therefore no well defined correlation between the current position of the
 * {@link Reader} and that of the {@link ReaderInputStream}. This also implies that in general there is no need to wrap the underlying {@link Reader} in a
 * {@link java.io.BufferedReader}.
 * </p>
 * <p>
 * {@link ReaderInputStream} implements the inverse transformation of {@link java.io.InputStreamReader}; in the following example, reading from {@code in2}
 * would return the same byte sequence as reading from {@code inputStream} (provided that the initial byte sequence is legal with respect to the charset encoding):
 * </p>
 * <p>
 * To build an instance, see {@link Builder}.
 * </p>
 * <pre>
 * InputStream inputStream = ...
 * Charset cs = ...
 * InputStreamReader reader = new InputStreamReader(inputStream, cs);
 * ReaderInputStream in2 = ReaderInputStream.builder()
 *   .setReader(reader)
 *   .setCharset(cs)
 *   .get();
 * </pre>
 * <p>
 * {@link ReaderInputStream} implements the same transformation as {@link java.io.OutputStreamWriter}, except that the control flow is reversed: both classes
 * transform a character stream into a byte stream, but {@link java.io.OutputStreamWriter} pushes data to the underlying stream, while {@link ReaderInputStream}
 * pulls it from the underlying stream.
 * </p>
 * <p>
 * Note that while there are use cases where there is no alternative to using this class, very often the need to use this class is an indication of a flaw in
 * the design of the code. This class is typically used in situations where an existing API only accepts an {@link InputStream}, but where the most natural way
 * to produce the data is as a character stream, i.e. by providing a {@link Reader} instance. An example of a situation where this problem may appear is when
 * implementing the {@code javax.activation.DataSource} interface from the Java Activation Framework.
 * </p>
 * <p>
 * The {@link #available()} method of this class always returns 0. The methods {@link #mark(int)} and {@link #reset()} are not supported.
 * </p>
 * <p>
 * Instances of {@link ReaderInputStream} are not thread safe.
 * </p>
 *
 * @see org.apache.commons.io.output.WriterOutputStream
 * @since 2.0
 */
public class ReaderInputStream extends InputStream {

    /**
     * Builds a new {@link ReaderInputStream} instance.
     * <p>
     * For example:
     * </p>
     * <pre>{@code
     * ReaderInputStream s = ReaderInputStream.builder()
     *   .setPath(path)
     *   .setCharsetEncoder(Charset.defaultCharset().newEncoder())
     *   .get();}
     * </pre>
     *
     * @since 2.12.0
     */
    public static class Builder extends AbstractStreamBuilder<ReaderInputStream, Builder> {

        private CharsetEncoder charsetEncoder = super.getCharset().newEncoder();

        /**
         * Constructs a new instance.
         * <p>
         * This builder uses the aspects Reader, Charset, CharsetEncoder, buffer size.
         * </p>
         * <p>
         * You must provide an origin that can be converted to a Reader by this builder, otherwise, this call will throw an
         * {@link UnsupportedOperationException}.
         * </p>
         *
         * @return a new instance.
         * @throws UnsupportedOperationException if the origin cannot provide a Reader.
         * @throws IllegalStateException if the {@code origin} is {@code null}.
         * @see AbstractOrigin#getReader(Charset)
         */
        @SuppressWarnings("resource")
        @Override
        public ReaderInputStream get() throws IOException {
            return new ReaderInputStream(checkOrigin().getReader(getCharset()), charsetEncoder, getBufferSize());
        }

        @Override
        public Builder setCharset(final Charset charset) {
            charsetEncoder = charset.newEncoder();
            return super.setCharset(charset);
        }

        /**
         * Sets the charset encoder.
         *
         * @param charsetEncoder the charset encoder.
         * @return this
         */
        public Builder setCharsetEncoder(final CharsetEncoder charsetEncoder) {
            this.charsetEncoder = charsetEncoder;
            super.setCharset(charsetEncoder.charset());
            return this;
        }

    }

    /**
     * Constructs a new {@link Builder}.
     *
     * @return a new {@link Builder}.
     * @since 2.12.0
     */
    public static Builder builder() {
        return new Builder();
    }

    static int checkMinBufferSize(final CharsetEncoder charsetEncoder, final int bufferSize) {
        final float minRequired = minBufferSize(charsetEncoder);
        if (bufferSize < minRequired) {
            throw new IllegalArgumentException(String.format("Buffer size %,d must be at least %s for a CharsetEncoder %s.", bufferSize, minRequired,
                    charsetEncoder.charset().displayName()));
        }
        return bufferSize;
    }

    static float minBufferSize(final CharsetEncoder charsetEncoder) {
        return charsetEncoder.maxBytesPerChar() * 2;
    }

    private final Reader reader;

    private final CharsetEncoder charsetEncoder;

    /**
     * CharBuffer used as input for the encoder. It should be reasonably large as we read data from the underlying Reader into this buffer.
     */
    private final CharBuffer encoderIn;
    /**
     * ByteBuffer used as output for the encoder. This buffer can be small as it is only used to transfer data from the encoder to the buffer provided by the
     * caller.
     */
    private final ByteBuffer encoderOut;

    private CoderResult lastCoderResult;

    private boolean endOfInput;

    /**
     * Constructs a new {@link ReaderInputStream} that uses the default character encoding with a default input buffer size of
     * {@value IOUtils#DEFAULT_BUFFER_SIZE} characters.
     *
     * @param reader the target {@link Reader}
     * @deprecated Use {@link ReaderInputStream#builder()} instead
     */
    @Deprecated
    public ReaderInputStream(final Reader reader) {
        this(reader, Charset.defaultCharset());
    }

    /**
     * Constructs a new {@link ReaderInputStream} with a default input buffer size of {@value IOUtils#DEFAULT_BUFFER_SIZE} characters.
     *
     * <p>
     * The encoder created for the specified charset will use {@link CodingErrorAction#REPLACE} for malformed input and unmappable characters.
     * </p>
     *
     * @param reader  the target {@link Reader}
     * @param charset the charset encoding
     * @deprecated Use {@link ReaderInputStream#builder()} instead, will be protected for subclasses.
     */
    @Deprecated
    public ReaderInputStream(final Reader reader, final Charset charset) {
        this(reader, charset, IOUtils.DEFAULT_BUFFER_SIZE);
    }

    /**
     * Constructs a new {@link ReaderInputStream}.
     *
     * <p>
     * The encoder created for the specified charset will use {@link CodingErrorAction#REPLACE} for malformed input and unmappable characters.
     * </p>
     *
     * @param reader     the target {@link Reader}.
     * @param charset    the charset encoding.
     * @param bufferSize the size of the input buffer in number of characters.
     * @deprecated Use {@link ReaderInputStream#builder()} instead
     */
    @Deprecated
    public ReaderInputStream(final Reader reader, final Charset charset, final int bufferSize) {
        // @formatter:off
        this(reader,
                Charsets.toCharset(charset).newEncoder()
                        .onMalformedInput(CodingErrorAction.REPLACE)
                        .onUnmappableCharacter(CodingErrorAction.REPLACE),
                bufferSize);
        // @formatter:on
    }

    /**
     * Constructs a new {@link ReaderInputStream}.
     *
     * <p>
     * This constructor does not call {@link CharsetEncoder#reset() reset} on the provided encoder. The caller of this constructor should do this when providing
     * an encoder which had already been in use.
     * </p>
     *
     * @param reader         the target {@link Reader}
     * @param charsetEncoder the charset encoder
     * @since 2.1
     * @deprecated Use {@link ReaderInputStream#builder()} instead
     */
    @Deprecated
    public ReaderInputStream(final Reader reader, final CharsetEncoder charsetEncoder) {
        this(reader, charsetEncoder, IOUtils.DEFAULT_BUFFER_SIZE);
    }

    /**
     * Constructs a new {@link ReaderInputStream}.
     *
     * <p>
     * This constructor does not call {@link CharsetEncoder#reset() reset} on the provided encoder. The caller of this constructor should do this when providing
     * an encoder which had already been in use.
     * </p>
     *
     * @param reader         the target {@link Reader}
     * @param charsetEncoder the charset encoder, null defaults to the default Charset encoder.
     * @param bufferSize     the size of the input buffer in number of characters
     * @since 2.1
     * @deprecated Use {@link ReaderInputStream#builder()} instead
     */
    @Deprecated
    public ReaderInputStream(final Reader reader, final CharsetEncoder charsetEncoder, final int bufferSize) {
        this.reader = reader;
        this.charsetEncoder = CharsetEncoders.toCharsetEncoder(charsetEncoder);
        this.encoderIn = CharBuffer.allocate(checkMinBufferSize(this.charsetEncoder, bufferSize));
        this.encoderIn.flip();
        this.encoderOut = ByteBuffer.allocate(128);
        this.encoderOut.flip();
    }

    /**
     * Constructs a new {@link ReaderInputStream} with a default input buffer size of {@value IOUtils#DEFAULT_BUFFER_SIZE} characters.
     *
     * <p>
     * The encoder created for the specified charset will use {@link CodingErrorAction#REPLACE} for malformed input and unmappable characters.
     * </p>
     *
     * @param reader      the target {@link Reader}
     * @param charsetName the name of the charset encoding
     * @deprecated Use {@link ReaderInputStream#builder()} instead
     */
    @Deprecated
    public ReaderInputStream(final Reader reader, final String charsetName) {
        this(reader, charsetName, IOUtils.DEFAULT_BUFFER_SIZE);
    }

    /**
     * Constructs a new {@link ReaderInputStream}.
     *
     * <p>
     * The encoder created for the specified charset will use {@link CodingErrorAction#REPLACE} for malformed input and unmappable characters.
     * </p>
     *
     * @param reader      the target {@link Reader}
     * @param charsetName the name of the charset encoding, null maps to the default Charset.
     * @param bufferSize  the size of the input buffer in number of characters
     * @deprecated Use {@link ReaderInputStream#builder()} instead
     */
    @Deprecated
    public ReaderInputStream(final Reader reader, final String charsetName, final int bufferSize) {
        this(reader, Charsets.toCharset(charsetName), bufferSize);
    }

    /**
     * Closes the stream. This method will cause the underlying {@link Reader} to be closed.
     *
     * @throws IOException if an I/O error occurs.
     */
    @Override
    public void close() throws IOException {
        reader.close();
    }

    /**
     * Fills the internal char buffer from the reader.
     *
     * @throws IOException If an I/O error occurs
     */
    private void fillBuffer() throws IOException {
        if (!endOfInput && (lastCoderResult == null || lastCoderResult.isUnderflow())) {
            encoderIn.compact();
            final int position = encoderIn.position();
            // We don't use Reader#read(CharBuffer) here because it is more efficient
            // to write directly to the underlying char array (the default implementation
            // copies data to a temporary char array).
            final int c = reader.read(encoderIn.array(), position, encoderIn.remaining());
            if (c == EOF) {
                endOfInput = true;
            } else {
                encoderIn.position(position + c);
            }
            encoderIn.flip();
        }
        encoderOut.compact();
        lastCoderResult = charsetEncoder.encode(encoderIn, encoderOut, endOfInput);
        if (endOfInput) {
            lastCoderResult = charsetEncoder.flush(encoderOut);
        }
        if (lastCoderResult.isError()) {
            lastCoderResult.throwException();
        }
        encoderOut.flip();
    }

    /**
     * Gets the CharsetEncoder.
     *
     * @return the CharsetEncoder.
     */
    CharsetEncoder getCharsetEncoder() {
        return charsetEncoder;
    }

    /**
     * Reads a single byte.
     *
     * @return either the byte read or {@code -1} if the end of the stream has been reached
     * @throws IOException if an I/O error occurs.
     */
    @Override
    public int read() throws IOException {
        for (;;) {
            if (encoderOut.hasRemaining()) {
                return encoderOut.get() & 0xFF;
            }
            fillBuffer();
            if (endOfInput && !encoderOut.hasRemaining()) {
                return EOF;
            }
        }
    }

    /**
     * Reads the specified number of bytes into an array.
     *
     * @param b the byte array to read into
     * @return the number of bytes read or {@code -1} if the end of the stream has been reached
     * @throws IOException if an I/O error occurs.
     */
    @Override
    public int read(final byte[] b) throws IOException {
        return read(b, 0, b.length);
    }

    /**
     * Reads the specified number of bytes into an array.
     *
     * @param array the byte array to read into
     * @param off   the offset to start reading bytes into
     * @param len   the number of bytes to read
     * @return the number of bytes read or {@code -1} if the end of the stream has been reached
     * @throws IOException if an I/O error occurs.
     */
    @Override
    public int read(final byte[] array, int off, int len) throws IOException {
        Objects.requireNonNull(array, "array");
        if (len < 0 || off < 0 || off + len > array.length) {
            throw new IndexOutOfBoundsException("Array size=" + array.length + ", offset=" + off + ", length=" + len);
        }
        int read = 0;
        if (len == 0) {
            return 0; // Always return 0 if len == 0
        }
        while (len > 0) {
            if (encoderOut.hasRemaining()) { // Data from the last read not fully copied
                final int c = Math.min(encoderOut.remaining(), len);
                encoderOut.get(array, off, c);
                off += c;
                len -= c;
                read += c;
            } else if (endOfInput) { // Already reach EOF in the last read
                break;
            } else { // Read again
                fillBuffer();
            }
        }
        return read == 0 && endOfInput ? EOF : read;
    }
}

Task Description:
The provided code defines a class ReaderInputStream which extends InputStream. This class is part of the Apache Commons IO library and provides an implementation that reads a character stream from a Reader and converts it into a byte stream using a specified charset encoding via a CharsetEncoder.

Here's a breakdown of the key components:

Class and Package Definition:

ReaderInputStream is located in the org.apache.commons.io.input package.

Imports:

The code imports various classes related to IO operations, charset handling, and utility functions from the Apache Commons IO library.

Class-level Documentation:

The class-level Javadoc explains the purpose of ReaderInputStream, which is to transform a character stream into a byte stream using a specified charset encoding.
It provides examples of usage and clarifies that the class is the inverse of InputStreamReader.

Builder Class:

The nested Builder class allows for the construction of ReaderInputStream instances using a fluent interface.
The builder supports setting the reader, charset, charset encoder, and buffer size.

Instance Variables:

reader: The source Reader from which characters are read.
charsetEncoder: The CharsetEncoder used to encode characters into bytes.
encoderIn: A CharBuffer used as input for the encoder.
encoderOut: A ByteBuffer used as output for the encoder.
lastCoderResult and endOfInput: Used to keep track of the encoding process and end-of-input status.

Constructors:

Several constructors are provided (deprecated in favor of using the builder) to initialize the class with different configurations of reader, charset, and buffer size.

Methods:

builder(): Returns a new instance of the Builder.
checkMinBufferSize(): Ensures the buffer size is sufficient for the given charset encoder.
close(): Closes the underlying reader.
fillBuffer(): Fills the internal char buffer from the reader, encoding characters into bytes.
read(): Reads a single byte from the stream.
read(byte[] b): Reads bytes into the provided byte array.
read(byte[] array, int off, int len): Reads bytes into a portion of the provided byte array.

Test Cases:

The provided code is expected to pass test cases, such as testResetCharsetEncoder, which validates that a charset encoder is correctly set when using the builder.

Overall, the ReaderInputStream class is designed to handle the transformation of character streams to byte streams, accommodating various charset encodings and ensuring correct handling of character to byte conversions, including edge cases such as UTF-16 encoding with byte order markers. The builder pattern simplifies the creation and configuration of ReaderInputStream instances.

Test Results:
The provided code is producing incorrect results. For example:

Running the test as follows, assertNotNull(ReaderInputStream.builder().setReader(new StringReader("\uD800")).setCharsetEncoder(null).getCharsetEncoder());, generates a NullPointerException in line 2

    @Test
    public void testResetCharsetEncoder() {
        assertNotNull(ReaderInputStream.builder().setReader(new StringReader("\uD800")).setCharsetEncoder(null).getCharsetEncoder());
    }

Spectrum-based Fault Localization (SBFL) Techniques Results:
We have used SBFL techniques to identify potentially faulty lines. Here are the results of top-[X] suspicious lines sorted in descending order of the score:
1. Line 140 super.setCharset(charsetEncoder.charset());, SBFL
...
Analyze the provided code and utilize task description, test results, and SBFL techniques results to help in identifying potentially faulty lines. Provide the results following this JSON template:

{
  "faultLoc": [    {      "faultyLine": (indicating the line number of the suspicious code),      "code": (displaying the actual code),      "explanation": (step by step reasoning on why this location is considered potentially faulty)    },    ...  ]
}

-----------------------------------------Response--------------------------------------------

{
  "faultLoc": [
    {
      "faultyLine": 140,
      "code": "super.setCharset(charsetEncoder.charset());",
      "explanation": "The call to super.setCharset(charsetEncoder.charset()) assumes charsetEncoder is non-null. When charsetEncoder is null, this results in a NullPointerException. The test case intentionally sets charsetEncoder to null, triggering this issue. The code should check for null before using charsetEncoder."
    },
    {
      "faultyLine": 125,
      "code": "private CharsetEncoder charsetEncoder = super.getCharset().newEncoder();",
      "explanation": "The initialization of charsetEncoder assumes that super.getCharset() is non-null and can successfully create a new CharsetEncoder. If super.getCharset() is null, this will cause a NullPointerException. The code should ensure super.getCharset() is not null before using it."
    },
    {
      "faultyLine": 267,
      "code": "this.charsetEncoder = CharsetEncoders.toCharsetEncoder(charsetEncoder);",
      "explanation": "The call to CharsetEncoders.toCharsetEncoder(charsetEncoder) does not handle the case where charsetEncoder is null. This method should handle null values gracefully, perhaps by assigning a default CharsetEncoder."
    },
    {
      "faultyLine": 280,
      "code": "private void fillBuffer() throws IOException {",
      "explanation": "The fillBuffer method relies on charsetEncoder, which can be null if not properly initialized or set. This method should include a check for null charsetEncoder to avoid potential NullPointerExceptions."
    }
  ]
}

Explanation

Line 140 (super.setCharset(charsetEncoder.charset());):
This line directly uses charsetEncoder without checking if it is null. When charsetEncoder is null, calling .charset() on it will cause a NullPointerException.

Line 125 (private CharsetEncoder charsetEncoder = super.getCharset().newEncoder();):
This line initializes charsetEncoder assuming super.getCharset() is not null. If super.getCharset() is null, this initialization will fail, leading to a NullPointerException.

Line 267 (this.charsetEncoder = CharsetEncoders.toCharsetEncoder(charsetEncoder);):
The method CharsetEncoders.toCharsetEncoder(charsetEncoder) should handle the case where charsetEncoder is null. Without this check, assigning a null charsetEncoder will lead to issues later in the code.

Line 280 (private void fillBuffer() throws IOException {):
The fillBuffer method operates on charsetEncoder, which might be null if not correctly initialized. Ensuring that charsetEncoder is not null within this method is crucial to prevent NullPointerExceptions during buffer operations.
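
To make the line 140 finding concrete, here is a minimal sketch of a null-safe setter. It is not the Commons IO code: NullSafeEncoderBuilder is a hypothetical stand-in for the nested Builder, written with only java.nio.charset so it runs on its own. It also assumes that passing null should reset the builder to the default charset's encoder, which is what the assertNotNull expectation in testResetCharsetEncoder seems to imply. The real fix in the library may look different.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in for the nested Builder; not the actual Commons IO class. */
final class NullSafeEncoderBuilder {

    // Plays the role of the charset kept by the parent builder (super.setCharset / super.getCharset).
    private Charset charset = Charset.defaultCharset();
    private CharsetEncoder charsetEncoder = charset.newEncoder();

    NullSafeEncoderBuilder setCharsetEncoder(final CharsetEncoder newEncoder) {
        // Assumption: a null argument means "reset to the default charset's encoder".
        this.charsetEncoder = newEncoder != null ? newEncoder : Charset.defaultCharset().newEncoder();
        // Read the charset from the field, which is never null at this point,
        // instead of dereferencing the possibly-null parameter.
        this.charset = this.charsetEncoder.charset();
        return this;
    }

    CharsetEncoder getCharsetEncoder() {
        return charsetEncoder;
    }

    Charset getCharset() {
        return charset;
    }

    public static void main(final String[] args) {
        // Mirrors the failing test: a null encoder must not throw and must not stay null.
        final NullSafeEncoderBuilder builder = new NullSafeEncoderBuilder().setCharsetEncoder(null);
        System.out.println(builder.getCharsetEncoder() != null);

        // A non-null encoder is kept, and the charset follows it.
        builder.setCharsetEncoder(StandardCharsets.UTF_16BE.newEncoder());
        System.out.println(builder.getCharset());
    }
}
```

The key change is that charset() is called on the field after the fallback has been applied, so the dereference that fails at line 140 can no longer see a null value.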
