Class MimeTypesUtils


  • public class MimeTypesUtils
    extends Object
    This class defines static helper methods to deal with MIME types.

    Security note: methods in this class fall into two categories:

    • Extension-based detection (getContentType(String), getExtensionContentType(String)): rely solely on the file name or extension. These must not be used for security validation, as a malicious user can rename any file to an allowed extension.
    • Content-based detection (getContentType(File, String), getContentType(InputStream, String)): inspect the file's magic bytes via Apache Tika. When a fileName is provided, it is used as a hint that may influence the result when content detection is ambiguous. To guarantee purely content-based detection (e.g. for security validation), pass null as fileName so that Tika relies exclusively on magic bytes.
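
    To see why the distinction matters, the sketch below uses two hypothetical stand-ins, byExtension and looksLikePng, which are not part of MimeTypesUtils or Tika. They only mimic the two categories for a single format: an executable renamed to photo.png passes the name-based check and fails the magic-byte check.

```java
import java.util.Arrays;
import java.util.Locale;
import java.util.Map;

class SpoofDemo {

    // Hypothetical extension table; the real class has its own, larger mapping.
    private static final Map<String, String> BY_EXTENSION =
            Map.of("png", "image/png", "pdf", "application/pdf");

    // The 8-byte signature that starts every PNG file.
    private static final byte[] PNG_MAGIC =
            {(byte) 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A};

    /** Name-based guess: looks only at the text after the last dot. */
    static String byExtension(String fileName) {
        int dot = fileName.lastIndexOf('.');
        String ext = dot < 0 ? "" : fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return BY_EXTENSION.getOrDefault(ext, "application/octet-stream");
    }

    /** Content-based check: compares the leading bytes with the PNG signature. */
    static boolean looksLikePng(byte[] content) {
        return content.length >= PNG_MAGIC.length
                && Arrays.equals(Arrays.copyOf(content, PNG_MAGIC.length), PNG_MAGIC);
    }

    public static void main(String[] args) {
        // "MZ" is the start of a Windows executable, not of a PNG image.
        byte[] disguised = {'M', 'Z', 0x00, 0x00, 0x00, 0x00, 0x00, 0x00};
        System.out.println(byExtension("photo.png")); // the name alone suggests an image
        System.out.println(looksLikePng(disguised));  // the content says otherwise
    }
}
```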
    • Constructor Detail

      • MimeTypesUtils

        public MimeTypesUtils()
    • Method Detail

      • getContentType

        public static String getContentType(File file)
        Returns the content type from the file, using both its content (magic bytes) and its name as a hint.

        Security note: the file name is passed as a hint to Tika and may influence the result when content detection is ambiguous. For security validation, use getContentType(File, String) with null as the file name to rely solely on magic bytes.

        Parameters:
        file - the file
        Returns:
        the detected content type, or "application/octet-stream" if the format is not supported
      • getContentType

        public static String getContentType(Path path)
        Returns the content type from the path, using both its content (magic bytes) and its name as a hint.

        Security note: the file name is passed as a hint to Tika and may influence the result when content detection is ambiguous. For security validation, convert the path with path.toFile() and call getContentType(File, String) with null as the file name to rely solely on magic bytes.

        Parameters:
        path - the path
        Returns:
        the detected content type, or "application/octet-stream" if the format is not supported
      • getContentType

        public static String getContentType(String fileName)
        Detects the content type of the given file name. The type detection is based solely on known file name extensions, without inspecting file content.

        Security note: this method must not be used for security validation, as a malicious user can rename any file to an allowed extension.

        Parameters:
        fileName - the file name
        Returns:
        the detected content type, or "application/octet-stream" if the format is not supported
      • getExtensionContentType

        public static String getExtensionContentType(String extension)
        Returns the content type from the file extension, without inspecting file content.

        Security note: this method must not be used for security validation, as a malicious user can rename any file to an allowed extension.

        Parameters:
        extension - the extension of the file (e.g., "doc")
        Returns:
        the detected content type, or "application/octet-stream" if the format is not supported
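
        The example value "doc" suggests that the extension is passed without a leading dot. A caller that only has a file name could derive it as sketched below. The helper extensionOf is hypothetical and not part of this class, and how getExtensionContentType treats letter case or a leading dot is not stated here, so the sketch leaves the case untouched.

```java
class ExtensionSketch {

    /**
     * Returns the text after the last dot of the last path segment,
     * or an empty string when there is no usable extension.
     */
    static String extensionOf(String fileName) {
        if (fileName == null) {
            return "";
        }
        // Ignore directory parts so "a.b/readme" does not yield "b/readme".
        int slash = Math.max(fileName.lastIndexOf('/'), fileName.lastIndexOf('\\'));
        String name = fileName.substring(slash + 1);
        int dot = name.lastIndexOf('.');
        // dot <= 0 also covers dot-files such as ".gitignore".
        if (dot <= 0 || dot == name.length() - 1) {
            return "";
        }
        return name.substring(dot + 1);
    }

    public static void main(String[] args) {
        System.out.println(extensionOf("report.final.doc"));
        System.out.println(extensionOf("archive.tar.gz"));
    }
}
```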
      • getContentType

        public static String getContentType(File file,
                                            String fileName)
        Returns the content type from the file content (magic bytes), optionally using the file name as a hint.

        Security note: if fileName is non-null and non-blank, it is passed to Tika as a hint and may influence the result when content detection is ambiguous. To guarantee purely content-based detection (e.g. for security validation), pass null as fileName.

        Parameters:
        file - the file, can be null
        fileName - the file name hint (e.g., "report.doc"), or null for content-only detection
        Returns:
        the detected content type, or "application/octet-stream" if the format is not supported
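
        For security validation, the note above amounts to calling the method with a null file name and comparing the result with an allow-list. The sketch below takes the detector as a BiFunction so that it compiles without the library; in an application one would presumably pass MimeTypesUtils::getContentType. The names UploadCheck, isAllowed, and ALLOWED are hypothetical.

```java
import java.io.File;
import java.util.Set;
import java.util.function.BiFunction;

class UploadCheck {

    // Hypothetical allow-list; adjust to the formats the application accepts.
    static final Set<String> ALLOWED = Set.of("image/png", "application/pdf");

    /**
     * Accepts the file only if content-only detection yields an allowed type.
     * The detector stands in for a (File, String) -> String method such as
     * MimeTypesUtils.getContentType(File, String).
     */
    static boolean isAllowed(File file, BiFunction<File, String, String> detector) {
        // Passing null as the file name keeps the name from influencing the result.
        String type = detector.apply(file, null);
        return type != null && ALLOWED.contains(type);
    }

    public static void main(String[] args) {
        // Stub detector for demonstration: always reports an unknown binary.
        BiFunction<File, String, String> stub = (f, name) -> "application/octet-stream";
        System.out.println(isAllowed(new File("upload.png"), stub));
    }
}
```

        Since unsupported formats are reported as "application/octet-stream", that value should normally stay off the allow-list.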
      • getContentType

        public static String getContentType(InputStream inputStream,
                                            String fileName)
        Detects the content type based on the input stream and an optional file name hint.

        This method is designed to be safe for all stream types: the provided InputStream is always left at its original position, so no data is consumed or lost.

        If the stream supports mark/reset (e.g., BufferedInputStream), this method will peek at the file header to determine the MIME type accurately, and then reset the stream to its original position.

        If the stream is null or does not support mark/reset (e.g., a raw FileInputStream), content inspection is skipped to avoid consuming data. In this case, detection relies solely on the extension of the provided fileName.
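
        The peek-and-reset behaviour described above relies on the standard InputStream mark/reset contract. The following sketch, using only JDK classes, shows the general mechanism; peekHeader is a hypothetical helper and not necessarily how this class is implemented.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class PeekSketch {

    /**
     * Reads up to {@code limit} leading bytes and rewinds the stream.
     * Returns an empty array when the stream is null or cannot be rewound.
     */
    static byte[] peekHeader(InputStream in, int limit) {
        if (in == null || !in.markSupported()) {
            return new byte[0]; // reading here would consume data for good
        }
        try {
            in.mark(limit);                       // remember the current position
            byte[] header = in.readNBytes(limit); // read at most 'limit' bytes
            in.reset();                           // go back to the marked position
            return header;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(
                "%PDF-1.7 body".getBytes(StandardCharsets.US_ASCII));
        byte[] header = peekHeader(in, 4);
        System.out.println(new String(header, StandardCharsets.US_ASCII));
        // The stream is back at its start, so the full content is still readable.
        System.out.println(new String(in.readAllBytes(), StandardCharsets.US_ASCII));
    }
}
```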

        Security note: if fileName is non-null and non-blank, it is passed to Tika as a hint and may influence the result when content detection is ambiguous. To guarantee purely content-based detection (e.g. for security validation), pass null as fileName and ensure the stream supports mark/reset (e.g. wrap it in a BufferedInputStream or use a ByteArrayInputStream).

        Parameters:
        inputStream - the input stream to inspect (can be null, or a raw stream without mark/reset support)
        fileName - the file name hint (e.g., "report.doc"), or null for content-only detection
        Returns:
        the detected content type, or "application/octet-stream" if detection fails
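
        Because content inspection is skipped for streams without mark/reset support, a caller aiming at content-only detection may want to wrap the stream first. A minimal sketch, with the hypothetical helper rewindable:

```java
import java.io.BufferedInputStream;
import java.io.InputStream;

class StreamSketch {

    /** Returns the stream itself if it can be rewound, otherwise a buffered wrapper. */
    static InputStream rewindable(InputStream in) {
        if (in == null || in.markSupported()) {
            return in;
        }
        return new BufferedInputStream(in);
    }

    public static void main(String[] args) {
        // A bare InputStream subclass does not support mark/reset by default.
        InputStream raw = new InputStream() {
            @Override
            public int read() {
                return -1; // empty stream, enough for the demonstration
            }
        };
        System.out.println(raw.markSupported());
        System.out.println(rewindable(raw).markSupported());
    }
}
```

        All later reads have to go through the returned wrapper rather than the original stream, since a BufferedInputStream may already hold bytes pulled from the underlying stream. A call would then presumably look like getContentType(rewindable(raw), null).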