TBEG Tips & Troubleshooting - Performance Optimization to Error Resolution

황용호·2026년 3월 19일

@dataProvider Best Practices JVM Java Performance Spring boot TBEG TROUBLESHOOTING apache poi excel kotlin large-scale library maven central open source optimization report streaming template tips

TBEG - Template Based Excel Report Generator

목록 보기

6/6

🇰🇷 한국어 버전

What This Post Covers

When you adopt TBEG in a real project, you'll encounter various situations. This post covers common problems and solutions, along with tips for better results.

Template Design Tips -- Building maintainable templates
Performance Optimization -- 3-step guide by data size
Common Errors & Solutions -- Diagnosing by symptom
Hideable Issues -- Selective field visibility troubleshooting
Production Guide -- Memory sizing, concurrent requests

1. Template Design Tips

Place repeat markers outside the repeat range

${repeat(...)} markers can be placed anywhere in the workbook. Placing them in the row above the data area improves readability.

	A	B	C
1	${repeat(employees, A2:C2, emp)}
2	${emp.name}	${emp.position}	${emp.salary}

Place formulas below the data area

Formula ranges are automatically expanded regardless of position. However, placing them below follows the natural data → aggregate order for better readability in Excel.

	A	B	C
1	${repeat(items, A2:C2, item)}
2	${item.name}	${item.value}	${item.qty}
3	Total	=SUM(B2:B2)	=AVERAGE(C2:C2)

All range-referencing formulas including SUM, AVERAGE, COUNT, MAX, and MIN are automatically adjusted.

One row per data item as a rule

One row per data item is the most intuitive approach. For complex layouts, use multi-row repeat (A2:B3).

Use hideable markers for fields that might need hiding

${hideable(emp.salary, C1:C3, dim)}

bundle range manages title and totals together
DIM mode: preserves layout + deactivation style
DELETE mode (default): physically removes the column

Avoid overlapping repeat regions

When placing multiple repeat regions on the same sheet, overlapping column ranges cause errors. Separate by columns or rows.

Full guide: Best Practices

2. Performance Optimization -- 3-Step Guide

Apply progressively based on data size.

Step 1: Provide count

Providing the total collection count prevents double data traversal.

val count = employeeRepository.count().toInt()

val provider = simpleDataProvider {
    items("employees", count) {
        employeeRepository.findAll().iterator()
    }
}

Step 2: Lazy loading

Don't preload all data into memory. Use lambdas to load at the point of need.

// Not recommended: preload all data
val allEmployees = employeeRepository.findAll()
items("employees", allEmployees)

// Recommended: lazy loading
items("employees", count) {
    employeeRepository.findAll().iterator()
}

Step 3: DB streaming

For 100K+ rows, use JPA Stream or MyBatis Cursor.

@Transactional(readOnly = true)
fun generateLargeReport(): Path {
    val count = employeeRepository.count().toInt()

    val provider = simpleDataProvider {
        items("employees", count) {
            employeeRepository.streamAll().iterator()
        }
    }

    return excelGenerator.generateToFile(
        template = template,
        dataProvider = provider,
        outputDir = outputDir,
        baseFileName = "large_report"
    )
}

Recommended Settings by Data Size

Data Size	Est. Time	Recommended Approach
~1,000 rows	~20ms	Map is sufficient
~10,000 rows	~110ms	simpleDataProvider + count
~50,000 rows	~500ms	+ DB Stream
~100,000 rows	~1s	+ DB Stream
~500,000 rows	~5s	Custom DataProvider + DB Stream + `generateToFile()`
~1,000,000 rows	~9s	Custom DataProvider + DB Stream + `generateToFile()`

Based on 3-column repeat + SUM formula (DataProvider + generateToFile). Varies by column count, formula complexity, and server specs.

3. Common Errors & Solutions

Markers remain in the output file

When ${title}, ${emp.name} etc. are not substituted:

Cause	Solution
Key missing from data	Add the variable to your data map or DataProvider
Variable name typo	Match marker names exactly with data keys
Repeat variable used outside range	`${emp.name}` etc. are only valid inside the repeat range

Tip: Set TbegConfig(missingDataBehavior = MissingDataBehavior.THROW) to get exceptions for missing data, making it easier to diagnose.

`TemplateProcessingException`

Thrown when syntax errors are found during template parsing.

ErrorType	Cause	Solution
`INVALID_MARKER_SYNTAX`	Marker syntax error	Check parentheses and parameter format
`MISSING_REQUIRED_PARAMETER`	Required param missing	Check `collection`, `range`, etc.
`INVALID_RANGE_FORMAT`	Invalid cell range	Use proper ranges like `A2:C2`
`RANGE_CONFLICT`	Range overlap or crossing	Separate overlapping ranges or adjust to full containment

`OutOfMemoryError`

Occurs when JVM memory is insufficient for large data processing.

Use lazy loading in DataProvider
Increase JVM heap: -Xmx2g
Split data into multiple files

Templates with pivot tables load the entire result file into memory during regeneration, with a practical limit of ~300K rows.

Merge results are not as expected

Merge only combines consecutive cells with the same value. Non-adjacent identical values become separate merge groups.

// Before sort: [Sales1, Sales2, Sales1] -> Sales1 split into 2 groups
// After sort: [Sales1, Sales1, Sales2] -> Sales1 merged into one
val employees = employeeRepository.findAll().sortedBy { it.department }

4. Hideable Issues

hideFields specified but fields are not hidden

Three checkpoints:

Field name mismatch: The field name in hideFields("employees", "salary") must match the template marker (${emp.salary}).
Collection name mismatch: The first argument must match the repeat's collection name.
Field outside repeat: hideFields only applies to repeat item fields.

Only data area is hidden, titles/totals remain

The hideable marker has no bundle parameter.

${hideable(emp.salary, C1:C4)}

Include the field title (C1), data (C2~C3), and total (C4) in the C1:C4 range to process them together.

Bundle range doesn't match the cell

DOWN repeat: The bundle's column range must match the hideable cell's column.
RIGHT repeat: The bundle's row range must match the hideable cell's row.

5. Production Guide

Memory Sizing

Data Scale	Recommended -Xmx	Notes
Under 10K rows	512MB	Most standard reports
100K rows	1~2GB
500K rows	4GB
1M rows	8GB

For concurrent report generation, multiply by the number of concurrent requests.

Concurrent Request Handling

ExcelGenerator is thread-safe, so use it as a singleton bean.

@Configuration
class TbegConfig {
    @Bean
    fun excelGenerator() = ExcelGenerator()
}

Dev/Prod Environment Separation

# application-dev.yml
tbeg:
  missing-data-behavior: throw

# application-prod.yml
tbeg:
  missing-data-behavior: warn

Use THROW in dev to catch missing data immediately, WARN in production.

JPA Stream Caution

JPA Streams close when the transaction ends. The transaction must remain active until Excel generation completes.

@Transactional(readOnly = true)  // Required!
fun generateReport(): ByteArray {
    val provider = simpleDataProvider {
        items("employees", count) {
            employeeRepository.streamAll().iterator()
        }
    }
    return excelGenerator.generate(template, provider)
}

Without @Transactional, you may encounter LazyInitializationException.

Wrap-up

Key takeaways:

Template design: Repeat markers outside range, formulas below, one row per data item
Performance: Provide count → lazy loading → DB streaming (3 steps)
Errors: Use MissingDataBehavior.THROW during development
Hideable: Check bundle range and field name alignment
Production: Thread-safe singleton + environment-specific config

For full details, see the GitHub documentation:

Best Practices

Troubleshooting Guide

GitHub: jogakdal/data-processors-with-excel
Maven Central: io.github.jogakdal:tbeg:1.2.3

황용호

이전 포스트