Introduction


Within about 15 minutes of reading this text, you will be able to add an API to your application that converts HTML documents to DOCX format using C#, VB.NET or any other .NET language. The solution works in both .NET Core and .NET Framework.

// The snippet uses File from the System.IO namespace.
SautinSoft.HtmlToRtf h = new SautinSoft.HtmlToRtf();

// Load the HTML from a string and convert it to DOCX in memory.
string htmlString = "Hello World!";
h.OpenHtml(htmlString);
byte[] docxBytes = h.ToDocx();

string outputFile = @"c:\Test\result.docx";
if (docxBytes != null)
{
    // Save the result and open it in the default DOCX viewer.
    File.WriteAllBytes(outputFile, docxBytes);
    System.Diagnostics.Process.Start(new System.Diagnostics.ProcessStartInfo(outputFile) {
                        UseShellExecute = true });
}

Despite its name, «HTML to RTF .Net» also converts to DOCX format, with full support for the Office Open XML specification. We have decided not to change the component's name.

The HtmlToRtf class provides various methods and properties for converting HTML to DOCX.

Using the overloaded methods OpenHtml() and ToDocx(), you can transform HTML documents to DOCX format as a MemoryStream, a file, a URI or a byte array.

Download


To verify the functionality of our SDK, download the latest «HTML to RTF .Net» with code examples (32.6 MB).

Restrictions:

The free version of «HTML to RTF .Net» adds the notice "Created by an unlicensed version of «HTML to RTF .Net»" to the output and randomly inserts the words "TRIAL VERSION".

Examples of converting HTML to DOCX in C# and VB.NET

1. Simple conversion of an HTML file to a DOCX file in C#:

SautinSoft.HtmlToRtf h = new SautinSoft.HtmlToRtf();
			  
string inputFile = @"d:\sample.html";
string outputFile = Path.ChangeExtension(inputFile, ".docx");

if (h.OpenHtml(inputFile))
{
    bool ok = h.ToDocx(outputFile);
}

2. Convert HTML Stream to DOCX Stream in C#:

SautinSoft.HtmlToRtf h = new SautinSoft.HtmlToRtf();
			  
string inputFile = @"d:\utf-8.html";
string outputFile = Path.ChangeExtension(inputFile, ".docx");

// Set the 'BaseURL' property so that the component can resolve relative paths to images, e.g. <img src="..\pict.png">,
// and to external CSS, e.g. <link rel="stylesheet" href="/css/style.css">.
h.BaseURL = Path.GetFullPath(inputFile);

using (FileStream htmlFileStream = new FileStream(inputFile, FileMode.Open))
{
    if (h.OpenHtml(htmlFileStream))
    {
        using (MemoryStream docxMemoryStream = new MemoryStream())
        {
            bool ok = h.ToDocx(docxMemoryStream);
        }
    }
}
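
In the stream example above, the DOCX MemoryStream is disposed as soon as its using block ends, so the result has to be consumed inside that block. Below is a minimal sketch of one way to persist it, using only standard .NET calls. The helper name DocxStreamHelper and its method SaveToFile are hypothetical and are not part of «HTML to RTF .Net».

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of the SautinSoft API.
static class DocxStreamHelper
{
    // Writes the whole content of an in-memory stream to a file.
    // MemoryStream.ToArray() copies the full buffer and ignores the
    // current Position, so no rewinding is needed after the conversion.
    public static void SaveToFile(MemoryStream source, string path)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (string.IsNullOrEmpty(path)) throw new ArgumentException("Path is empty.", "path");

        File.WriteAllBytes(path, source.ToArray());
    }
}
```

Assuming the conversion reported success, a call such as DocxStreamHelper.SaveToFile(docxMemoryStream, outputFile) would go inside the inner using block, right after the ToDocx(docxMemoryStream) call.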

3. Convert HTML to DOCX in memory using VB.NET:

Dim h As New SautinSoft.HtmlToRtf()
			  
Dim inputFile As String = "d:\pic.html"
Dim outputFile As String = Path.ChangeExtension(inputFile, ".docx")

' Read the HTML file as bytes.
Dim htmlBytes() As Byte = File.ReadAllBytes(inputFile)

' Set the 'BaseURL' property so that the component can resolve relative paths to images, e.g. <img src="..\pict.png">,
' and to external CSS, e.g. <link rel="stylesheet" href="/css/style.css">.
h.BaseURL = Path.GetFullPath(inputFile)

If h.OpenHtml(htmlBytes) Then
    Dim docxBytes() As Byte = h.ToDocx()

    ' Open the result for demonstration purposes.
    If docxBytes IsNot Nothing Then
        File.WriteAllBytes(outputFile, docxBytes)
        System.Diagnostics.Process.Start(New System.Diagnostics.ProcessStartInfo(outputFile) With {.UseShellExecute = True})
    End If
End If

4. Convert HTML to DOCX in C#, adding a custom page header from HTML and a footer from an RTF file:

SautinSoft.HtmlToRtf h = new SautinSoft.HtmlToRtf();

string inputFile = @"d:\document.html";
string outputFile = Path.ChangeExtension(inputFile, ".docx");

// Set page header and footer.
string headerFromHtml = File.ReadAllText(@"d:\header.html");
string footerFromRtf = File.ReadAllText(@"d:\footer.rtf");

// Add page header.
h.PageStyle.PageHeader.Html(headerFromHtml);

// Add extra space between header and page contents.
h.PageStyle.PageHeader.MarginBottom.Mm(10);

// Add page footer.
h.PageStyle.PageFooter.Rtf(footerFromRtf);

if (h.OpenHtml(inputFile))
{
    bool ok = h.ToDocx(outputFile);
}

5. Add page numbering during HTML to DOCX conversion in C#:

SautinSoft.HtmlToRtf h = new SautinSoft.HtmlToRtf();

string inputFile = @"..\..\sample.html";
string outputFile = Path.ChangeExtension(inputFile, ".docx");

// Add page numbering.
// Start page numbering from the first page.
h.PageStyle.PageNumbers.Appearance = SautinSoft.HtmlToRtf.ePageNumberingAppearence.PageNumFirst;

// Align page numbers at the top center.
h.PageStyle.PageNumbers.AlignV = SautinSoft.HtmlToRtf.eAlign.Top;
h.PageStyle.PageNumbers.AlignH = SautinSoft.HtmlToRtf.eAlign.Center;

// Set the page number format to "Page 1 of 20".
h.PageStyle.PageNumbers.Format = "Page {page} of {numpages}";

// Set the page number font: Calibri, 19.
h.PageStyle.PageNumbers.Font.Face = SautinSoft.HtmlToRtf.eFontFace.f_Calibri;
h.PageStyle.PageNumbers.Font.Size = 19;


if (h.OpenHtml(inputFile))
{
    bool ok = h.ToDocx(outputFile);
}

Technical information and requirements


Requires only .NET Framework 4.0 or later, or .NET Core 2.0 or later. The product is compatible with all .NET languages and supports every operating system on which .NET Framework or .NET Core can run.

Note that «HTML to RTF .Net» is written entirely in managed C#, which makes it a fully standalone, independent library.

.NET Framework, .NET Core
  • .NET Framework 4.0, 4.5, 4.6.1 and higher.
  • .NET Standard 2.0
  • .NET Core 2.0 and higher.

Multi-platform component, runs on:

  • Windows
  • Linux
  • Mac OS
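
The samples on this page use Windows-style paths such as d:\sample.html, which do not exist on Linux or Mac OS. Below is a minimal sketch of building the input and output paths in a platform-neutral way with standard .NET calls. The class name PortablePaths and the "samples" subfolder are assumptions made for illustration only.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of the SautinSoft API.
static class PortablePaths
{
    // Builds a path to an HTML file inside a "samples" folder (an assumed
    // location) next to the application, using the separator of the
    // operating system the code is running on.
    public static string GetInputPath(string fileName)
    {
        string baseDir = AppDomain.CurrentDomain.BaseDirectory;
        return Path.Combine(baseDir, "samples", fileName);
    }

    // Derives the DOCX output path from the input path,
    // in the same way as the samples on this page.
    public static string GetOutputPath(string inputPath)
    {
        return Path.ChangeExtension(inputPath, ".docx");
    }
}
```

The resulting strings can be passed to OpenHtml() and ToDocx() in place of the hard-coded paths used in the examples.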

Our component has proven itself on cloud platforms and services:

  • SharePoint
  • Google Cloud Platform
  • Amazon Web Services (AWS)
  • Microsoft Azure
  • Docker etc.